An Extended Study of Human-like Behavior under Adversarial Training
- URL: http://arxiv.org/abs/2303.12669v1
- Date: Wed, 22 Mar 2023 15:47:16 GMT
- Title: An Extended Study of Human-like Behavior under Adversarial Training
- Authors: Paul Gavrikov, Janis Keuper, Margret Keuper
- Abstract summary: We show that adversarial training increases the shift toward shape bias in neural networks.
We also provide a possible explanation for this phenomenon from a frequency perspective.
- Score: 11.72025865314187
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural networks have a number of shortcomings. Among the most severe is
their sensitivity to distribution shifts, which allows models to be easily fooled
into wrong predictions by small input perturbations that are often imperceptible
to humans and need not carry semantic meaning. Adversarial training offers a
partial solution to this issue by training models on worst-case perturbations.
Yet, recent work has also pointed out that the reasoning in neural networks
differs from that of humans: humans identify objects by shape, while neural
networks mainly rely on texture cues. For example, a model trained on photographs
will likely fail to generalize to datasets containing sketches. Interestingly, it
has also been shown that adversarial training appears to favorably shift models
toward a shape bias. In this work, we revisit this observation and provide an
extensive analysis of this effect across various architectures, the common
$\ell_2$- and $\ell_\infty$-training, and Transformer-based models. Further, we
offer a possible explanation for this phenomenon from a frequency perspective.
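As background for the $\ell_\infty$-training mentioned in the abstract, the snippet below is a minimal sketch of one common way to implement adversarial training with projected gradient descent (PGD) in PyTorch. It is an illustrative assumption, not the authors' training setup: the classifier, the data loader, and the values of `epsilon`, `alpha`, and `steps` are placeholders.

```python
# Minimal sketch of l_inf adversarial training with PGD.
# Assumptions (not from the paper): a generic PyTorch classifier `model`,
# a `train_loader` yielding images in [0, 1], and illustrative hyperparameters.
import torch
import torch.nn.functional as F

def pgd_attack(model, x, y, epsilon=8 / 255, alpha=2 / 255, steps=7):
    """Approximate the worst-case perturbation inside an l_inf ball of radius epsilon."""
    delta = torch.empty_like(x).uniform_(-epsilon, epsilon).requires_grad_(True)
    for _ in range(steps):
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Ascend the loss, then project back into the epsilon-ball and the valid pixel range.
        delta = (delta + alpha * grad.sign()).clamp(-epsilon, epsilon).detach()
        delta = ((x + delta).clamp(0, 1) - x).requires_grad_(True)
    return delta.detach()

def adversarial_training_epoch(model, train_loader, optimizer):
    device = next(model.parameters()).device
    model.train()
    for x, y in train_loader:
        x, y = x.to(device), y.to(device)
        delta = pgd_attack(model, x, y)                # inner maximization
        optimizer.zero_grad()
        loss = F.cross_entropy(model(x + delta), y)    # outer minimization
        loss.backward()
        optimizer.step()
```

The $\ell_2$ counterpart replaces the sign step with a gradient normalized to unit $\ell_2$ norm and projects the perturbation back onto an $\ell_2$ ball of radius $\epsilon$.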
Related papers
- Can Biases in ImageNet Models Explain Generalization? [13.802802975822704]
Generalization is one of the major challenges of current deep learning methods.
For image classification, this manifests in the existence of adversarial attacks, performance drops on distorted images, and a lack of generalization to concepts such as sketches.
We perform a large-scale study on 48 ImageNet models obtained via different training methods to understand whether and how these biases interact with generalization.
arXiv Detail & Related papers (2024-04-01T22:25:48Z)
- A Dynamical Model of Neural Scaling Laws [79.59705237659547]
We analyze a random feature model trained with gradient descent as a solvable model of network training and generalization.
Our theory shows how the gap between training and test loss can gradually build up over time due to repeated reuse of data.
arXiv Detail & Related papers (2024-02-02T01:41:38Z)
- MENTOR: Human Perception-Guided Pretraining for Increased Generalization [5.596752018167751]
We introduce MENTOR (huMan pErceptioN-guided preTraining fOr increased geneRalization).
We train an autoencoder to learn human saliency maps given an input image, without class labels.
We then remove the decoder, add a classification layer on top of the encoder, and fine-tune this new model conventionally (see the sketch following this entry).
arXiv Detail & Related papers (2023-10-30T13:50:44Z)
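The three-step pipeline summarized in this entry can be pictured with the following minimal PyTorch sketch. It is a hypothetical illustration rather than the authors' implementation: the toy encoder/decoder, the MSE saliency loss, and `num_classes` are assumptions.

```python
# Hypothetical sketch of a MENTOR-style pipeline (assumed details, not the paper's code):
# (1) pretrain an autoencoder to predict human saliency maps from images (no class labels),
# (2) discard the decoder, (3) put a classification head on the encoder and fine-tune.
import torch
import torch.nn as nn
import torch.nn.functional as F

encoder = nn.Sequential(                                   # toy convolutional encoder
    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
)
decoder = nn.Sequential(                                   # predicts a 1-channel saliency map
    nn.ConvTranspose2d(64, 32, 4, stride=2, padding=1), nn.ReLU(),
    nn.ConvTranspose2d(32, 1, 4, stride=2, padding=1), nn.Sigmoid(),
)

def pretrain_step(x, saliency, optimizer):
    """Step 1: regress human saliency maps; class labels are never used."""
    optimizer.zero_grad()
    loss = F.mse_loss(decoder(encoder(x)), saliency)
    loss.backward()
    optimizer.step()
    return loss.item()

# Steps 2-3: drop the decoder, attach a classifier head, fine-tune conventionally.
num_classes = 10                                           # placeholder
classifier = nn.Sequential(encoder, nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                           nn.Linear(64, num_classes))

def finetune_step(x, y, optimizer):
    optimizer.zero_grad()
    loss = F.cross_entropy(classifier(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```

Here the pretraining optimizer would cover the encoder and decoder parameters, while the fine-tuning optimizer covers the `classifier`.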
- Connecting metrics for shape-texture knowledge in computer vision [1.7785095623975342]
Deep neural networks remain brittle and susceptible to many changes in an image that do not cause humans to misclassify it.
Part of this different behavior may be explained by the type of features humans and deep neural networks use in vision tasks.
arXiv Detail & Related papers (2023-01-25T14:37:42Z)
- Benign Overfitting in Two-layer Convolutional Neural Networks [90.75603889605043]
We study the benign overfitting phenomenon in training a two-layer convolutional neural network (CNN).
We show that when the signal-to-noise ratio satisfies a certain condition, a two-layer CNN trained by gradient descent can achieve arbitrarily small training and test loss.
On the other hand, when this condition does not hold, overfitting becomes harmful and the obtained CNN can only achieve a constant-level test loss.
arXiv Detail & Related papers (2022-02-14T07:45:51Z)
- Explainable Adversarial Attacks in Deep Neural Networks Using Activation Profiles [69.9674326582747]
This paper presents a visual framework to investigate neural network models subjected to adversarial examples.
We show how observing these activation profiles can quickly pinpoint exploited areas in a model.
arXiv Detail & Related papers (2021-03-18T13:04:21Z)
- Predictive coding feedback results in perceived illusory contours in a recurrent neural network [0.0]
We equip a deep feedforward convolutional network with brain-inspired recurrent dynamics.
We show that the perception of illusory contours could involve feedback connections.
arXiv Detail & Related papers (2021-02-03T09:07:09Z)
- A Bayesian Perspective on Training Speed and Model Selection [51.15664724311443]
We show that a measure of a model's training speed can be used to estimate its marginal likelihood.
We verify our results in model selection tasks for linear models and for the infinite-width limit of deep neural networks.
Our results suggest a promising new direction towards explaining why neural networks trained with gradient descent are biased towards functions that generalize well.
arXiv Detail & Related papers (2020-10-27T17:56:14Z)
- Stereopagnosia: Fooling Stereo Networks with Adversarial Perturbations [71.00754846434744]
We show that imperceptible additive perturbations can significantly alter the disparity map.
We show that, when used for adversarial data augmentation, our perturbations result in trained models that are more robust.
arXiv Detail & Related papers (2020-09-21T19:20:09Z)
- Encoding Robustness to Image Style via Adversarial Feature Perturbations [72.81911076841408]
We adapt adversarial training by directly perturbing feature statistics, rather than image pixels, to produce robust models.
Our proposed method, Adversarial Batch Normalization (AdvBN), is a single network layer that generates worst-case feature perturbations during training (a simplified sketch of the idea follows after this entry).
arXiv Detail & Related papers (2020-09-18T17:52:34Z)
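To make the idea of perturbing feature statistics rather than pixels more concrete, here is a simplified sketch in the spirit of the AdvBN entry above. It is not the authors' AdvBN layer: the split of the network into `encoder` and `head`, the multiplicative perturbation of per-channel mean and standard deviation, and the budget `epsilon` are illustrative assumptions.

```python
# Simplified sketch of adversarially perturbing per-channel feature statistics
# (mean/std) of an intermediate representation. The encoder/head split and all
# hyperparameters are assumptions for illustration, not the paper's AdvBN layer.
import torch
import torch.nn.functional as F

def perturb_feature_stats(encoder, head, x, y, epsilon=0.2, alpha=0.05, steps=5):
    feats = encoder(x).detach()                      # assumed shape: N x C x H x W
    n, c = feats.shape[:2]
    flat = feats.reshape(n, c, -1)
    mu = flat.mean(dim=2).reshape(n, c, 1, 1)        # per-sample, per-channel mean
    sigma = flat.std(dim=2).reshape(n, c, 1, 1) + 1e-5
    normalized = (feats - mu) / sigma

    # Multiplicative perturbations of the statistics, tuned to increase the loss.
    d_mu = torch.zeros_like(mu, requires_grad=True)
    d_sigma = torch.zeros_like(sigma, requires_grad=True)
    for _ in range(steps):
        perturbed = normalized * sigma * (1 + d_sigma) + mu * (1 + d_mu)
        loss = F.cross_entropy(head(perturbed), y)
        g_mu, g_sigma = torch.autograd.grad(loss, (d_mu, d_sigma))
        with torch.no_grad():
            d_mu += alpha * g_mu.sign()
            d_sigma += alpha * g_sigma.sign()
            d_mu.clamp_(-epsilon, epsilon)
            d_sigma.clamp_(-epsilon, epsilon)
    return (normalized * sigma * (1 + d_sigma) + mu * (1 + d_mu)).detach()

def train_step(encoder, head, optimizer, x, y):
    # For simplicity only the head is trained on the perturbed features here;
    # a full pipeline would also update the encoder (e.g. on clean inputs).
    perturbed = perturb_feature_stats(encoder, head, x, y)
    optimizer.zero_grad()
    loss = F.cross_entropy(head(perturbed), y)
    loss.backward()
    optimizer.step()
    return loss.item()
```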