Semantic Image Attack for Visual Model Diagnosis
- URL: http://arxiv.org/abs/2303.13010v1
- Date: Thu, 23 Mar 2023 03:13:04 GMT
- Title: Semantic Image Attack for Visual Model Diagnosis
- Authors: Jinqi Luo, Zhaoning Wang, Chen Henry Wu, Dong Huang, Fernando De la
Torre
- Abstract summary: In practice, metric analysis on a specific train and test dataset does not guarantee reliable or fair ML models.
This paper proposes Semantic Image Attack (SIA), a method based on the adversarial attack that provides semantic adversarial images.
- Score: 80.36063332820568
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In practice, metric analysis on a specific train and test dataset does not
guarantee reliable or fair ML models. This is partially due to the fact that
obtaining a balanced, diverse, and perfectly labeled dataset is typically
expensive, time-consuming, and error-prone. Rather than relying on a carefully
designed test set to assess ML models' failures, fairness, or robustness, this
paper proposes Semantic Image Attack (SIA), a method based on the adversarial
attack that provides semantic adversarial images to allow model diagnosis,
interpretability, and robustness. Traditional adversarial training is a popular
methodology for robustifying ML models against attacks. However, existing
adversarial methods do not combine the two aspects that enable the
interpretation and analysis of the model's flaws: semantic traceability and
perceptual quality. SIA combines the two features via iterative gradient ascent
on a predefined semantic attribute space and the image space. We illustrate the
validity of our approach in three scenarios for keypoint detection and
classification. (1) Model diagnosis: SIA generates a histogram of attributes
that highlights the semantic vulnerability of the ML model (i.e., attributes
that make the model fail). (2) Stronger attacks: SIA generates adversarial
examples with visually interpretable attributes that lead to higher attack
success rates than baseline methods. The adversarial training on SIA improves
the transferable robustness across different gradient-based attacks. (3)
Robustness to imbalanced datasets: we use SIA to augment the underrepresented
classes, which outperforms strong augmentation and re-balancing baselines.
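The core mechanism described in the abstract, iterative gradient ascent on both a semantic attribute space and the image space, can be sketched on a toy linear model. Everything below (the linear "generator" G, classifier w, step sizes, and clipping bound) is an illustrative assumption, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d_attr, d_img = 4, 16
G = rng.normal(size=(d_img, d_attr))   # toy linear "generator": image = G @ a
w = rng.normal(size=d_img)             # toy linear classifier: score = w @ x
a = rng.normal(size=d_attr)            # semantic attribute code
delta = np.zeros(d_img)                # pixel-space perturbation

def attack_loss(a, delta):
    # Score the attack drives up (higher score = model pushed toward failure)
    return w @ (G @ a + delta)

loss_before = attack_loss(a, delta)
for _ in range(10):
    grad_a = G.T @ w                   # analytic gradient w.r.t. attributes
    grad_delta = w                     # analytic gradient w.r.t. pixels
    a = a + 0.1 * grad_a               # ascent step in semantic attribute space
    # Bounded ascent step in image space keeps the pixel change small
    delta = np.clip(delta + 0.01 * grad_delta, -0.05, 0.05)

adv_image = G @ a + delta
```

The attribute-space step is what makes the failure semantically traceable (each coordinate of `a` names an attribute), while the clipped image-space step preserves perceptual quality.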
Related papers
- MOREL: Enhancing Adversarial Robustness through Multi-Objective Representation Learning [1.534667887016089]
Deep neural networks (DNNs) are vulnerable to slight adversarial perturbations.
We show that strong feature representation learning during training can significantly enhance the original model's robustness.
We propose MOREL, a multi-objective feature representation learning approach, encouraging classification models to produce similar features for inputs within the same class, despite perturbations.
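The multi-objective idea above can be sketched as a loss combining the classification objective with a term that pulls the features of a clean input and its perturbed copy together. The feature/logit layout and the weighting `lam` are assumptions for illustration, not MOREL's exact formulation:

```python
import numpy as np

def cross_entropy(logits, label):
    # Standard softmax cross-entropy with a stability shift
    z = logits - logits.max()
    p = np.exp(z) / np.exp(z).sum()
    return -np.log(p[label])

def morel_style_loss(feat_clean, feat_adv, logits, label, lam=0.5):
    task = cross_entropy(logits, label)
    # Alignment term: same input under perturbation should keep similar features
    align = np.mean((feat_clean - feat_adv) ** 2)
    return task + lam * align
```

When the perturbed features match the clean ones, the alignment term vanishes and only the task loss remains; any feature drift under attack is penalized.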
arXiv Detail & Related papers (2024-10-02T16:05:03Z)
- Adversarial Robustification via Text-to-Image Diffusion Models [56.37291240867549]
Adversarial robustness has been conventionally believed to be a challenging property to encode into neural networks.
We develop a scalable and model-agnostic solution to achieve adversarial robustness without using any data.
arXiv Detail & Related papers (2024-07-26T10:49:14Z)
- SA-Attack: Improving Adversarial Transferability of Vision-Language Pre-training Models via Self-Augmentation [56.622250514119294]
In contrast to white-box adversarial attacks, transfer attacks are more reflective of real-world scenarios.
We propose a self-augment-based transfer attack method, termed SA-Attack.
arXiv Detail & Related papers (2023-12-08T09:08:50Z)
- Counterfactual Image Generation for adversarially robust and interpretable Classifiers [1.3859669037499769]
We propose a unified framework leveraging image-to-image translation Generative Adversarial Networks (GANs) to produce counterfactual samples.
This is achieved by combining the classifier and discriminator into a single model that attributes real images to their respective classes and flags generated images as "fake".
We show how the model exhibits improved robustness to adversarial attacks, and we show how the discriminator's "fakeness" value serves as an uncertainty measure of the predictions.
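The combined classifier/discriminator can be read as a single softmax head over K real classes plus one extra "fake" class; the probability mass on the fake class then doubles as the uncertainty measure. The K+1 layout below is an illustrative reading of the summary, not the paper's exact head design:

```python
import numpy as np

def predict(logits_k_plus_1):
    # Last logit is the "fake" class; the first K are the real classes
    K = logits_k_plus_1.shape[-1] - 1
    z = logits_k_plus_1 - logits_k_plus_1.max()
    p = np.exp(z) / np.exp(z).sum()
    fakeness = p[K]                  # "fakeness" value, usable as uncertainty
    cls = int(np.argmax(p[:K]))      # predicted class among the K real ones
    return cls, fakeness
```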
arXiv Detail & Related papers (2023-10-01T18:50:29Z)
- Doubly Robust Instance-Reweighted Adversarial Training [107.40683655362285]
We propose a novel doubly-robust instance reweighted adversarial framework.
Our importance weights are obtained by optimizing the KL-divergence regularized loss function.
Our proposed approach outperforms related state-of-the-art baseline methods in terms of average robust performance.
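Maximizing a weighted loss under a KL-divergence penalty toward the uniform distribution has a closed-form softmax solution, which gives one plausible sketch of how such importance weights could look; the temperature `tau` and per-example losses are illustrative assumptions:

```python
import numpy as np

def kl_regularized_weights(losses, tau=1.0):
    # Solution of max_w sum_i w_i * l_i - tau * KL(w || uniform) on the simplex
    z = np.asarray(losses, dtype=float) / tau
    z -= z.max()                     # numerical stability before exponentiating
    w = np.exp(z)
    return w / w.sum()               # harder examples receive larger weights
```

As `tau` grows the weights flatten toward uniform; as it shrinks they concentrate on the hardest instances.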
arXiv Detail & Related papers (2023-08-01T06:16:18Z)
- On Evaluating the Adversarial Robustness of Semantic Segmentation Models [0.0]
A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
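The "simple adversarial training algorithms" referenced here typically use a PGD-style inner maximization; a minimal sketch on a toy linear model follows, with the epsilon, step size, and iteration count chosen purely for illustration:

```python
import numpy as np

def pgd_attack(x, w, y, eps=0.1, step=0.02, iters=10):
    # For a linear score y * (w @ x), the loss gradient w.r.t. x is -y * w
    delta = np.zeros_like(x)
    for _ in range(iters):
        grad = -y * w                      # gradient of the margin loss
        delta += step * np.sign(grad)      # signed-gradient ascent step
        delta = np.clip(delta, -eps, eps)  # project back into the L-inf ball
    return x + delta
```

Adversarial training then simply trains on `pgd_attack(x, ...)` in place of `x` at each step.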
arXiv Detail & Related papers (2023-06-25T11:45:08Z)
- Adversarial Training for Improving Model Robustness? Look at Both Prediction and Interpretation [21.594361495948316]
We propose a novel feature-level adversarial training method named FLAT.
FLAT incorporates variational word masks in neural networks to learn global word importance.
Experiments show the effectiveness of FLAT in improving the robustness with respect to both predictions and interpretations.
arXiv Detail & Related papers (2022-03-23T20:04:14Z)
- Clustering Effect of (Linearized) Adversarial Robust Models [60.25668525218051]
We propose a novel understanding of adversarial robustness and apply it to more tasks, including domain adaptation and robustness boosting.
Experimental evaluations demonstrate the rationality and superiority of our proposed clustering strategy.
arXiv Detail & Related papers (2021-11-25T05:51:03Z)
- Understanding the Logit Distributions of Adversarially-Trained Deep Neural Networks [6.439477789066243]
Adversarial defenses train deep neural networks to be invariant to the input perturbations from adversarial attacks.
Although adversarial training is successful at mitigating adversarial attacks, the behavioral differences between adversarially-trained (AT) models and standard models are still poorly understood.
We identify three logit characteristics essential to learning adversarial robustness.
arXiv Detail & Related papers (2021-08-26T19:09:15Z)
- Trust but Verify: Assigning Prediction Credibility by Counterfactual Constrained Learning [123.3472310767721]
Prediction credibility measures are fundamental in statistics and machine learning.
These measures should account for the wide variety of models used in practice.
The framework developed in this work expresses the credibility as a risk-fit trade-off.
arXiv Detail & Related papers (2020-11-24T19:52:38Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.