Can Biases in ImageNet Models Explain Generalization?
- URL: http://arxiv.org/abs/2404.01509v1
- Date: Mon, 1 Apr 2024 22:25:48 GMT
- Title: Can Biases in ImageNet Models Explain Generalization?
- Authors: Paul Gavrikov, Janis Keuper,
- Abstract summary: Generalization is one of the major challenges of current deep learning methods.
For image classification, this manifests in the existence of adversarial attacks, the performance drops on distorted images, and a lack of generalization to concepts such as sketches.
We perform a large-scale study on 48 ImageNet models obtained via different training methods to understand how and if these biases interact with generalization.
- Score: 13.802802975822704
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The robust generalization of models to rare, in-distribution (ID) samples drawn from the long tail of the training distribution and to out-of-training-distribution (OOD) samples is one of the major challenges of current deep learning methods. For image classification, this manifests in the existence of adversarial attacks, the performance drops on distorted images, and a lack of generalization to concepts such as sketches. The current understanding of generalization in neural networks is very limited, but some biases that differentiate models from human vision have been identified and might be causing these limitations. Consequently, several attempts with varying success have been made to reduce these biases during training to improve generalization. We take a step back and sanity-check these attempts. Fixing the architecture to the well-established ResNet-50, we perform a large-scale study on 48 ImageNet models obtained via different training methods to understand how and if these biases - including shape bias, spectral biases, and critical bands - interact with generalization. Our extensive study results reveal that contrary to previous findings, these biases are insufficient to accurately predict the generalization of a model holistically. We provide access to all checkpoints and evaluation code at https://github.com/paulgavrikov/biases_vs_generalization
Related papers
- Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free
Ensembles of DNNs [9.010643838773477]
We introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data.
We show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously appreciated.
We use our observations to construct a new ensemble method, based solely on the training history of a single network, which provides significant improvement without any additional cost in training time.
arXiv Detail & Related papers (2023-10-17T09:22:22Z) - An Extended Study of Human-like Behavior under Adversarial Training [11.72025865314187]
We show that adversarial training increases the shift toward shape bias in neural networks.
We also provide a possible explanation for this phenomenon from a frequency perspective.
arXiv Detail & Related papers (2023-03-22T15:47:16Z) - When Neural Networks Fail to Generalize? A Model Sensitivity Perspective [82.36758565781153]
Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions.
This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG)
We empirically ascertain a property of a model that correlates strongly with its generalization that we coin as "model sensitivity"
We propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies.
arXiv Detail & Related papers (2022-12-01T20:15:15Z) - Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face
Recognition [107.58227666024791]
Face recognition systems are widely deployed in safety-critical applications, including law enforcement.
They exhibit bias across a range of socio-demographic dimensions, such as gender and race.
Previous works on bias mitigation largely focused on pre-processing the training data.
arXiv Detail & Related papers (2022-10-18T15:46:05Z) - Unsupervised Learning of Unbiased Visual Representations [10.871587311621974]
Deep neural networks are known for their inability to learn robust representations when biases exist in the dataset.
We propose a fully unsupervised debiasing framework, consisting of three steps.
We employ state-of-the-art supervised debiasing techniques to obtain an unbiased model.
arXiv Detail & Related papers (2022-04-26T10:51:50Z) - General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space.
GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z) - Visual Recognition with Deep Learning from Biased Image Datasets [6.10183951877597]
We show how biasing models can be applied to remedy problems in the context of visual recognition.
Based on the (approximate) knowledge of the biasing mechanisms at work, our approach consists in reweighting the observations.
We propose to use a low dimensional image representation, shared across the image databases.
arXiv Detail & Related papers (2021-09-06T10:56:58Z) - Evading the Simplicity Bias: Training a Diverse Set of Models Discovers
Solutions with Superior OOD Generalization [93.8373619657239]
Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features.
This simplicity bias can explain their lack of robustness out of distribution (OOD)
We demonstrate that the simplicity bias can be mitigated and OOD generalization improved.
arXiv Detail & Related papers (2021-05-12T12:12:24Z) - EnD: Entangling and Disentangling deep representations for bias
correction [7.219077740523682]
We propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases.
In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias.
Experiments show that EnD effectively improves the generalization on unbiased test sets.
arXiv Detail & Related papers (2021-03-02T20:55:42Z) - In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk.
When evaluated empirically, most of these bounds are numerically vacuous.
We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z) - Learning from Failure: Training Debiased Classifier from Biased
Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge.
We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously.
Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.