On the Benefits of Models with Perceptually-Aligned Gradients
- URL: http://arxiv.org/abs/2005.01499v1
- Date: Mon, 4 May 2020 14:05:38 GMT
- Title: On the Benefits of Models with Perceptually-Aligned Gradients
- Authors: Gunjan Aggarwal, Abhishek Sinha, Nupur Kumari, Mayank Singh
- Abstract summary: We show that interpretable and perceptually aligned gradients are present even in models that do not show high robustness to adversarial attacks.
We leverage models with interpretable perceptually-aligned features and show that adversarial training with a low max-perturbation bound can improve the performance of models for zero-shot and weakly supervised localization tasks.
- Score: 8.427953227125148
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Adversarially robust models have been shown to learn more robust and
interpretable features than standard trained models. As shown by Tsipras et
al. (2018), such robust models inherit useful interpretable
properties where the gradient aligns perceptually well with images, and adding
a large targeted adversarial perturbation leads to an image resembling the
target class. We perform experiments to show that interpretable and
perceptually aligned gradients are present even in models that do not show high
robustness to adversarial attacks. Specifically, we perform adversarial
training with attacks under different max-perturbation bounds. Adversarial
training with a low max-perturbation bound results in models that have
interpretable features with only a slight drop in performance on clean samples. In this
paper, we leverage models with interpretable perceptually-aligned features and
show that adversarial training with a low max-perturbation bound can improve the
performance of models for zero-shot and weakly supervised localization tasks.
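A minimal sketch of the training recipe described in the abstract, assuming a standard PGD-based adversarial training loop with a deliberately small L-infinity bound; the model, optimizer, eps = 2/255, and other hyperparameters below are illustrative placeholders, not the paper's exact configuration:
```python
# Sketch: adversarial training with a low max-perturbation bound (L-infinity PGD).
# Assumptions: a generic torchvision ResNet, CIFAR-like 3x32x32 inputs in [0, 1],
# and eps = 2/255 as the "low" bound; the paper's setup may differ.
import torch
import torch.nn.functional as F
from torchvision.models import resnet18


def pgd_attack(model, x, y, eps=2 / 255, alpha=1 / 255, steps=7):
    """Projected gradient descent attack within an L-infinity ball of radius eps."""
    x_adv = (x.detach() + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)
    for _ in range(steps):
        x_adv.requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad, = torch.autograd.grad(loss, x_adv)
        x_adv = x_adv.detach() + alpha * grad.sign()      # ascend the loss
        x_adv = x.detach() + (x_adv - x).clamp(-eps, eps)  # project back into the eps-ball
        x_adv = x_adv.clamp(0, 1)
    return x_adv.detach()


def train_step(model, optimizer, x, y, eps=2 / 255):
    """One adversarial-training step: fit the model on PGD-perturbed inputs."""
    model.eval()                          # freeze batch-norm stats while crafting the attack
    x_adv = pgd_attack(model, x, y, eps=eps)
    model.train()
    optimizer.zero_grad()
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    optimizer.step()
    return loss.item()


if __name__ == "__main__":
    model = resnet18(num_classes=10)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1, momentum=0.9)
    x = torch.rand(8, 3, 32, 32)          # dummy batch standing in for real images
    y = torch.randint(0, 10, (8,))
    print("loss:", train_step(model, optimizer, x, y))
```
Lowering eps trades robustness for clean accuracy; the abstract's observation is that even such small bounds already produce perceptually-aligned input gradients, which can then be reused as saliency maps for zero-shot and weakly supervised localization.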
Related papers
- Reinforcing Pre-trained Models Using Counterfactual Images [54.26310919385808]
This paper proposes a novel framework to reinforce classification models using language-guided generated counterfactual images.
We identify model weaknesses by testing the model using the counterfactual image dataset.
We employ the counterfactual images as an augmented dataset to fine-tune and reinforce the classification model.
arXiv Detail & Related papers (2024-06-19T08:07:14Z) - Pre-trained Model Guided Fine-Tuning for Zero-Shot Adversarial Robustness [52.9493817508055]
We propose Pre-trained Model Guided Adversarial Fine-Tuning (PMG-AFT) to enhance the model's zero-shot adversarial robustness.
Our approach consistently improves clean accuracy by an average of 8.72%.
arXiv Detail & Related papers (2024-01-09T04:33:03Z) - Interpretable Computer Vision Models through Adversarial Training:
Unveiling the Robustness-Interpretability Connection [0.0]
Interpretability is as essential as robustness when we deploy models in the real world.
Standard models, compared to robust ones, are more susceptible to adversarial attacks, and their learned representations are less meaningful to humans.
arXiv Detail & Related papers (2023-07-04T13:51:55Z) - Does Saliency-Based Training bring Robustness for Deep Neural Networks
in Image Classification? [0.0]
The black-box nature of Deep Neural Networks impedes a complete understanding of their inner workings.
Online saliency-guided training methods try to highlight the prominent features in the model's output to alleviate this problem.
We quantify the robustness and conclude that, despite the well-explained visualizations in the model's output, saliency-trained models suffer lower performance against adversarial example attacks.
arXiv Detail & Related papers (2023-06-28T22:20:19Z) - On Evaluating the Adversarial Robustness of Semantic Segmentation Models [0.0]
A number of adversarial training approaches have been proposed as a defense against adversarial perturbation.
We show for the first time that a number of models in previous work that are claimed to be robust are in fact not robust at all.
We then evaluate simple adversarial training algorithms that produce reasonably robust models even under our set of strong attacks.
arXiv Detail & Related papers (2023-06-25T11:45:08Z) - No One Representation to Rule Them All: Overlapping Features of Training
Methods [12.58238785151714]
High-performing models tend to make similar predictions regardless of training methodology.
Recent work has shown that very different training techniques, such as large-scale contrastive learning, can yield competitively high accuracy.
We show that these models specialize in how they generalize from the data, leading to higher ensemble performance.
arXiv Detail & Related papers (2021-10-20T21:29:49Z) - Unleashing the Power of Contrastive Self-Supervised Visual Models via
Contrast-Regularized Fine-Tuning [94.35586521144117]
We investigate whether applying contrastive learning to fine-tuning would bring further benefits.
We propose Contrast-regularized tuning (Core-tuning), a novel approach for fine-tuning contrastive self-supervised visual models.
arXiv Detail & Related papers (2021-02-12T16:31:24Z) - Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z) - Orthogonal Deep Models As Defense Against Black-Box Attacks [71.23669614195195]
We study the inherent weakness of deep models in black-box settings where the attacker may develop the attack using a model similar to the targeted model.
We introduce a novel gradient regularization scheme that encourages the internal representation of a deep model to be orthogonal to another.
We verify the effectiveness of our technique on a variety of large-scale models.
arXiv Detail & Related papers (2020-06-26T08:29:05Z) - Regularizers for Single-step Adversarial Training [49.65499307547198]
We propose three types of regularizers that help to learn robust models using single-step adversarial training methods.
The regularizers mitigate the effect of gradient masking by harnessing properties that differentiate a robust model from a pseudo-robust model.
arXiv Detail & Related papers (2020-02-03T09:21:04Z)