Interpretable Computer Vision Models through Adversarial Training:
Unveiling the Robustness-Interpretability Connection
- URL: http://arxiv.org/abs/2307.02500v2
- Date: Sun, 19 Nov 2023 15:38:50 GMT
- Title: Interpretable Computer Vision Models through Adversarial Training:
Unveiling the Robustness-Interpretability Connection
- Authors: Delyan Boychev
- Abstract summary: Interpretability is as essential as robustness when we deploy models in the real world.
Standard models, compared to robust ones, are more susceptible to adversarial attacks, and their learned representations are less meaningful to humans.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: As state-of-the-art deep neural networks grow ever more complex,
maintaining their interpretability becomes an increasingly challenging task. Our
work evaluates the effects of adversarial training, which is used to produce
robust models that are less vulnerable to adversarial attacks and has been shown
to make computer vision models more interpretable. Interpretability is as
essential as robustness when models are deployed in the real world. To
demonstrate the correlation between these two properties, we extensively examine
the models using local feature-importance methods (SHAP, Integrated Gradients)
and feature visualization techniques (Representation Inversion, Class Specific
Image Generation). Compared to robust models, standard models are more
susceptible to adversarial attacks, and their learned representations are less
meaningful to humans. Robust models, by contrast, focus on the distinctive
regions of the images that support their predictions, and the features they
learn are closer to the real ones.
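The evaluation described above leans on local feature-importance methods such as Integrated Gradients. As a rough illustration (not the authors' code), the following is a minimal PyTorch sketch of Integrated Gradients; the model, image tensor, and label names are hypothetical, and in practice a library such as Captum provides a tested implementation.

    import torch

    def integrated_gradients(model, x, target, baseline=None, steps=50):
        # Average the input gradients along the straight-line path from a baseline
        # image (default: all zeros) to the input, then scale by (input - baseline).
        model.eval()
        if baseline is None:
            baseline = torch.zeros_like(x)
        total_grad = torch.zeros_like(x)
        for alpha in torch.linspace(0.0, 1.0, steps):
            point = (baseline + alpha * (x - baseline)).clone().requires_grad_(True)
            logit = model(point.unsqueeze(0))[0, target]  # score of the target class
            logit.backward()
            total_grad += point.grad
        return (x - baseline) * total_grad / steps

    # Hypothetical usage: compare attribution maps of a standard and a robust model
    # on the same image to see which regions each model relies on.
    # standard_attr = integrated_gradients(standard_model, image, label)
    # robust_attr   = integrated_gradients(robust_model, image, label)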
Related papers
- Towards Evaluating the Robustness of Visual State Space Models [63.14954591606638]
Vision State Space Models (VSSMs) have demonstrated remarkable performance in visual perception tasks.
However, their robustness under natural and adversarial perturbations remains a critical concern.
We present a comprehensive evaluation of VSSMs' robustness under various perturbation scenarios.
arXiv Detail & Related papers (2024-06-13T17:59:44Z)
- Does Saliency-Based Training bring Robustness for Deep Neural Networks in Image Classification? [0.0]
The black-box nature of Deep Neural Networks impedes a complete understanding of their inner workings.
Online saliency-guided training methods try to highlight the prominent features in the model's output to alleviate this problem.
We quantify the robustness and conclude that, despite the well-explained visualizations in the model's output, the salient models suffer from lower performance against adversarial example attacks.
arXiv Detail & Related papers (2023-06-28T22:20:19Z)
- Robust Models are less Over-Confident [10.42820615166362]
Adversarial training (AT) aims to achieve robustness against adversarial attacks.
We empirically analyze a variety of adversarially trained models that achieve high robust accuracies.
AT has an interesting side-effect: it leads to models that are significantly less overconfident with their decisions.
arXiv Detail & Related papers (2022-10-12T06:14:55Z)
- Robustness in Deep Learning for Computer Vision: Mind the gap? [13.576376492050185]
We identify, analyze, and summarize current definitions and progress towards non-adversarial robustness in deep learning for computer vision.
We find that this area of research has received disproportionately little attention relative to adversarial machine learning.
arXiv Detail & Related papers (2021-12-01T16:42:38Z)
- Harnessing Perceptual Adversarial Patches for Crowd Counting [92.79051296850405]
Crowd counting is vulnerable to adversarial examples in the physical world.
This paper proposes the Perceptual Adversarial Patch (PAP) generation framework to learn the shared perceptual features between models.
arXiv Detail & Related papers (2021-09-16T13:51:39Z)
- Impact of Attention on Adversarial Robustness of Image Classification Models [0.9176056742068814]
Adversarial attacks against deep learning models have gained significant attention.
Recent works have proposed explanations for the existence of adversarial examples and techniques to defend the models against these attacks.
This work aims at a general understanding of the impact of attention on adversarial robustness.
arXiv Detail & Related papers (2021-09-02T13:26:32Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing activation profiles can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Proactive Pseudo-Intervention: Causally Informed Contrastive Learning For Interpretable Vision Models [103.64435911083432]
We present a novel contrastive learning strategy called Proactive Pseudo-Intervention (PPI).
PPI leverages proactive interventions to guard against image features with no causal relevance.
We also devise a novel causally informed salience mapping module to identify key image pixels to intervene, and show it greatly facilitates model interpretability.
arXiv Detail & Related papers (2020-12-06T20:30:26Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Stylized Adversarial Defense [105.88250594033053]
Adversarial training creates perturbation patterns and includes them in the training set to robustify the model (see the PGD-style sketch after this list).
We propose to exploit additional information from the feature space to craft stronger adversaries.
Our adversarial training approach demonstrates strong robustness compared to state-of-the-art defenses.
arXiv Detail & Related papers (2020-07-29T08:38:10Z)
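Several of the entries above, like the main paper, rely on adversarial training to robustify models. As a hedged sketch (illustrative defaults, not any listed paper's exact recipe), a PGD-style adversarial training step for a PyTorch classifier with inputs in [0, 1] could look like this:

    import torch
    import torch.nn.functional as F

    def pgd_attack(model, x, y, eps=8/255, alpha=2/255, steps=10):
        # Search for a worst-case perturbation inside an L-infinity ball of
        # radius eps around x (inputs assumed to lie in [0, 1]).
        x_adv = (x + torch.empty_like(x).uniform_(-eps, eps)).clamp(0, 1)  # random start
        for _ in range(steps):
            x_adv.requires_grad_(True)
            loss = F.cross_entropy(model(x_adv), y)
            grad = torch.autograd.grad(loss, x_adv)[0]
            with torch.no_grad():
                x_adv = x_adv + alpha * grad.sign()                    # ascend the loss
                x_adv = torch.max(torch.min(x_adv, x + eps), x - eps)  # project into the eps-ball
                x_adv = x_adv.clamp(0, 1)
        return x_adv.detach()

    def adversarial_training_step(model, optimizer, x, y):
        # One optimizer step on adversarial examples instead of clean inputs.
        model.eval()                    # keep batch-norm statistics fixed while attacking
        x_adv = pgd_attack(model, x, y)
        model.train()
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
        return loss.item()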