On Evaluating the Adversarial Robustness of Semantic Segmentation Models
- URL: http://arxiv.org/abs/2306.14217v1
- Date: Sun, 25 Jun 2023 11:45:08 GMT
- Title: On Evaluating the Adversarial Robustness of Semantic Segmentation Models
- Authors: Levente Halmosi and Mark Jelasity
- Abstract summary: A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Achieving robustness against adversarial input perturbation is an important
and intriguing problem in machine learning. In the area of semantic image
segmentation, a number of adversarial training approaches have been proposed as
a defense against adversarial perturbation, but the methodology for evaluating
the robustness of these models still lags behind that of image
classification. Here, we demonstrate that, just like in image classification,
it is important to evaluate the models over several different and hard attacks.
We propose a set of gradient-based iterative attacks and show that it is
essential to perform a large number of iterations. We include attacks against
the internal representations of the models as well. We apply two types of
attacks: maximizing the error with a bounded perturbation, and minimizing the
perturbation for a given level of error. Using this set of attacks, we show for
the first time that a number of models in previous work that are claimed to be
robust are in fact not robust at all. We then evaluate simple adversarial
training algorithms that produce reasonably robust models even under our set of
strong attacks. Our results indicate that a key design decision to achieve any
robustness is to use only adversarial examples during training. However, this
introduces a trade-off between robustness and accuracy.
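To make the first attack formulation concrete, below is a minimal PGD-style sketch of maximizing the per-pixel error under a bounded perturbation, for a PyTorch segmentation model. The function name, budget eps, step size alpha, and iteration count are illustrative assumptions; this is not the paper's exact attack, which also targets internal representations and uses several attack variants.

```python
# Minimal sketch, not the paper's exact attack: an L-infinity-bounded
# iterative attack that maximizes the mean per-pixel cross-entropy of a
# segmentation model. eps, alpha, and steps are illustrative assumptions.
import torch
import torch.nn.functional as F

def linf_attack_segmentation(model, x, y, eps=8/255, alpha=2/255, steps=100):
    """x: (B, 3, H, W) images in [0, 1]; y: (B, H, W) ground-truth labels."""
    # Random start inside the L-infinity ball, as in standard PGD.
    x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1).detach()
    for _ in range(steps):  # the abstract stresses that many iterations are essential
        x_adv.requires_grad_(True)
        logits = model(x_adv)              # (B, C, H, W) per-pixel class scores
        loss = F.cross_entropy(logits, y)  # mean error over all pixels
        grad = torch.autograd.grad(loss, x_adv)[0]
        with torch.no_grad():
            x_adv = x_adv + alpha * grad.sign()                    # ascent step
            x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project to ball
            x_adv = x_adv.clamp(0, 1)                              # keep valid pixels
        x_adv = x_adv.detach()
    return x_adv
```

The second attack type from the abstract, minimizing the perturbation for a given level of error, could reuse the same inner loop while searching (for example, bisecting) over eps for the smallest budget that still drives the error above the target.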
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has conventionally been considered a challenging property to encode in neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- Counterfactual Image Generation for adversarially robust and interpretable Classifiers [1.3859669037499769]
We propose a unified framework leveraging image-to-image translation Generative Adversarial Networks (GANs) to produce counterfactual samples.
This is achieved by combining the classifier and discriminator into a single model that attributes real images to their respective classes and flags generated images as "fake".
We show how the model exhibits improved robustness to adversarial attacks, and we show how the discriminator's "fakeness" value serves as an uncertainty measure of the predictions.
arXiv Detail & Related papers (2023-10-01T18:50:29Z)
- Towards Reliable Evaluation and Fast Training of Robust Semantic Segmentation Models [47.03411822627386]
We propose several novel problem-specific attacks that minimize different accuracy- and mIoU-based metrics.
Surprisingly, existing adversarially trained semantic segmentation models turn out to be weakly robust or even completely non-robust.
We show how recently proposed robust ImageNet backbones can be used to obtain adversarially robust semantic segmentation models with up to six times less training time for PASCAL-VOC and the more challenging ADE20k.
arXiv Detail & Related papers (2023-06-22T14:56:06Z)
- Semantic Image Attack for Visual Model Diagnosis [80.36063332820568]
In practice, metric analysis on a specific training and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
arXiv Detail & Related papers (2023-03-23T03:13:04Z)
- SegPGD: An Effective and Efficient Adversarial Attack for Evaluating and Boosting Segmentation Robustness [63.726895965125145]
Deep neural network-based image classifiers are vulnerable to adversarial perturbations.
In this work, we propose an effective and efficient segmentation attack method, dubbed SegPGD.
Since SegPGD can create more effective adversarial examples, adversarial training with SegPGD can boost the robustness of segmentation models (see the sketch below).
arXiv Detail & Related papers (2022-07-25T17:56:54Z)
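Since the summary above credits SegPGD's effectiveness to how it constructs adversarial examples, here is a hedged sketch of the loss as we understand it: per-pixel cross-entropy reweighted between correctly and wrongly classified pixels, so that early iterations concentrate on pixels the model still gets right. The schedule lam = t / (2 * T) is our reading of the method and should be treated as an assumption.

```python
# Hedged sketch of a SegPGD-style loss: per-pixel cross-entropy reweighted
# between correctly and wrongly classified pixels. The schedule lam = t/(2T)
# is our reading of the method, not a verified constant.
import torch
import torch.nn.functional as F

def segpgd_style_loss(logits, y, t, T):
    """logits: (B, C, H, W); y: (B, H, W); t: current iteration; T: total iterations."""
    lam = t / (2.0 * T)                                 # ramps from 0 toward 0.5
    ce = F.cross_entropy(logits, y, reduction="none")   # (B, H, W) per-pixel loss
    correct = logits.argmax(dim=1) == y                 # pixels still classified correctly
    # Early on (lam ~ 0) the attack focuses on correctly classified pixels;
    # later it gradually rebalances toward already-misclassified ones.
    weights = correct.float() * (1.0 - lam) + (~correct).float() * lam
    return (weights * ce).mean()
```

Plugging this loss into the PGD loop sketched after the abstract above would yield a SegPGD-like attack.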
- Clustering Effect of (Linearized) Adversarial Robust Models [60.25668525218051]
We propose a novel understanding of adversarial robustness and apply it to more tasks, including domain adaptation and robustness boosting.
Experimental evaluations demonstrate the soundness and superiority of our proposed clustering strategy.
arXiv Detail & Related papers (2021-11-25T05:51:03Z)
- A Differentiable Language Model Adversarial Attack on Text Classifiers [10.658675415759697]
We propose a new black-box sentence-level attack for natural language processing.
Our method fine-tunes a pre-trained language model to generate adversarial examples.
We show that the proposed attack outperforms competitors on a diverse set of NLP problems in terms of both computed metrics and human evaluation.
arXiv Detail & Related papers (2021-07-23T14:43:13Z)
- Differentiable Language Model Adversarial Attacks on Categorical Sequence Classifiers [0.0]
An adversarial attack paradigm explores various scenarios for the vulnerability of deep learning models.
We fine-tune a language model to serve as a generator of adversarial examples.
Our model works for diverse datasets on bank transactions, electronic health records, and NLP datasets.
arXiv Detail & Related papers (2020-06-19T11:25:36Z)
- Fundamental Tradeoffs between Invariance and Sensitivity to Adversarial Perturbations [65.05561023880351]
Adversarial examples are malicious inputs crafted to induce misclassification.
This paper studies a complementary failure mode, invariance-based adversarial examples.
We show that defenses against sensitivity-based attacks actively harm a model's accuracy on invariance-based attacks.
arXiv Detail & Related papers (2020-02-11T18:50:23Z)