Related papers: Can Biases in ImageNet Models Explain Generalization?

URL: http://arxiv.org/abs/2404.01509v1
Date: Mon, 1 Apr 2024 22:25:48 GMT
Title: Can Biases in ImageNet Models Explain Generalization?
Authors: Paul Gavrikov, Janis Keuper
Abstract summary: Generalization is one of the major challenges of current deep learning methods. For image classification, this manifests in the existence of adversarial attacks, performance drops on distorted images, and a lack of generalization to concepts such as sketches. We perform a large-scale study on 48 ImageNet models obtained via different training methods to understand how and if biases such as shape bias, spectral biases, and critical bands interact with generalization.
Score: 13.802802975822704
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The robust generalization of models to rare, in-distribution (ID) samples drawn from the long tail of the training distribution and to out-of-training-distribution (OOD) samples is one of the major challenges of current deep learning methods. For image classification, this manifests in the existence of adversarial attacks, the performance drops on distorted images, and a lack of generalization to concepts such as sketches. The current understanding of generalization in neural networks is very limited, but some biases that differentiate models from human vision have been identified and might be causing these limitations. Consequently, several attempts with varying success have been made to reduce these biases during training to improve generalization. We take a step back and sanity-check these attempts. Fixing the architecture to the well-established ResNet-50, we perform a large-scale study on 48 ImageNet models obtained via different training methods to understand how and if these biases - including shape bias, spectral biases, and critical bands - interact with generalization. Our extensive study results reveal that contrary to previous findings, these biases are insufficient to accurately predict the generalization of a model holistically. We provide access to all checkpoints and evaluation code at https://github.com/paulgavrikov/biases_vs_generalization
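One way to picture the question the abstract poses, whether a bias measure predicts generalization across a set of models, is a rank correlation between the two quantities. The following is only a minimal sketch under stated assumptions: it is not taken from the authors' repository, the function names are hypothetical, and the numbers are made-up placeholders rather than results from the study.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Return 1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the value at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Placeholder values for four hypothetical models (NOT real measurements).
    shape_bias = [0.20, 0.35, 0.50, 0.80]
    ood_accuracy = [0.30, 0.28, 0.41, 0.33]
    print("Spearman correlation:", spearman(shape_bias, ood_accuracy))
```

A coefficient near +1 or -1 across many models and benchmarks would suggest the bias tracks generalization; values that are weak or inconsistent across benchmarks would be in line with the abstract's conclusion that these biases do not predict generalization holistically.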

Related papers

A Classical View on Benign Overfitting: The Role of Sample Size [14.36840959836957]
We focus on almost benign overfitting, where models simultaneously achieve both arbitrarily small training and test errors. This behavior is characteristic of neural networks, which often achieve low (but non-zero) training error while still generalizing well.
arXiv Detail & Related papers (2025-05-16T18:37:51Z)
Bayesian Cross-Modal Alignment Learning for Few-Shot Out-of-Distribution Generalization [47.64583975469164]
We introduce a novel cross-modal image-text alignment learning method (Bayes-CAL) to address this issue. Bayes-CAL achieves state-of-the-art OoD generalization performances on two-dimensional distribution shifts. Compared with CLIP-like models, Bayes-CAL yields more stable generalization performances on unseen classes.
arXiv Detail & Related papers (2025-04-13T06:13:37Z)
Exploring Bias in over 100 Text-to-Image Generative Models [49.60774626839712]
We investigate bias trends in text-to-image generative models over time, focusing on the increasing availability of models through open platforms like Hugging Face. We assess bias across three key dimensions: (i) distribution bias, (ii) generative hallucination, and (iii) generative miss-rate. Our findings indicate that artistic and style-transferred models exhibit significant bias, whereas foundation models, benefiting from broader training distributions, are becoming progressively less biased.
arXiv Detail & Related papers (2025-03-11T03:40:44Z)
Relearning Forgotten Knowledge: on Forgetting, Overfit and Training-Free Ensembles of DNNs [9.010643838773477]
We introduce a novel score for quantifying overfit, which monitors the forgetting rate of deep models on validation data. We show that overfit can occur with and without a decrease in validation accuracy, and may be more common than previously appreciated. We use our observations to construct a new ensemble method, based solely on the training history of a single network, which provides significant improvement without any additional cost in training time.
arXiv Detail & Related papers (2023-10-17T09:22:22Z)
An Extended Study of Human-like Behavior under Adversarial Training [11.72025865314187]
We show that adversarial training increases the shift toward shape bias in neural networks. We also provide a possible explanation for this phenomenon from a frequency perspective.
arXiv Detail & Related papers (2023-03-22T15:47:16Z)
When Neural Networks Fail to Generalize? A Model Sensitivity Perspective [82.36758565781153]
Domain generalization (DG) aims to train a model to perform well in unseen domains under different distributions. This paper considers a more realistic yet more challenging scenario, namely Single Domain Generalization (Single-DG). We empirically ascertain a property of a model that correlates strongly with its generalization, which we coin "model sensitivity". We propose a novel strategy of Spectral Adversarial Data Augmentation (SADA) to generate augmented images targeted at the highly sensitive frequencies.
arXiv Detail & Related papers (2022-12-01T20:15:15Z)
Rethinking Bias Mitigation: Fairer Architectures Make for Fairer Face Recognition [107.58227666024791]
Face recognition systems are widely deployed in safety-critical applications, including law enforcement. They exhibit bias across a range of socio-demographic dimensions, such as gender and race. Previous works on bias mitigation largely focused on pre-processing the training data.
arXiv Detail & Related papers (2022-10-18T15:46:05Z)
Unsupervised Learning of Unbiased Visual Representations [10.871587311621974]
Deep neural networks are known for their inability to learn robust representations when biases exist in the dataset. We propose a fully unsupervised debiasing framework, consisting of three steps. We employ state-of-the-art supervised debiasing techniques to obtain an unbiased model.
arXiv Detail & Related papers (2022-04-26T10:51:50Z)
General Greedy De-bias Learning [163.65789778416172]
We propose a General Greedy De-bias learning framework (GGD), which greedily trains the biased models and the base model like gradient descent in functional space. GGD can learn a more robust base model under the settings of both task-specific biased models with prior knowledge and self-ensemble biased model without prior knowledge.
arXiv Detail & Related papers (2021-12-20T14:47:32Z)
Visual Recognition with Deep Learning from Biased Image Datasets [6.10183951877597]
We show how biasing models can be applied to remedy problems in the context of visual recognition. Based on the (approximate) knowledge of the biasing mechanisms at work, our approach consists in reweighting the observations. We propose to use a low dimensional image representation, shared across the image databases.
arXiv Detail & Related papers (2021-09-06T10:56:58Z)
Evading the Simplicity Bias: Training a Diverse Set of Models Discovers Solutions with Superior OOD Generalization [93.8373619657239]
Neural networks trained with SGD were recently shown to rely preferentially on linearly-predictive features. This simplicity bias can explain their lack of robustness out of distribution (OOD). We demonstrate that the simplicity bias can be mitigated and OOD generalization improved.
arXiv Detail & Related papers (2021-05-12T12:12:24Z)
EnD: Entangling and Disentangling deep representations for bias correction [7.219077740523682]
We propose EnD, a regularization strategy whose aim is to prevent deep models from learning unwanted biases. In particular, we insert an "information bottleneck" at a certain point of the deep neural network, where we disentangle the information about the bias. Experiments show that EnD effectively improves the generalization on unbiased test sets.
arXiv Detail & Related papers (2021-03-02T20:55:42Z)
In Search of Robust Measures of Generalization [79.75709926309703]
We develop bounds on generalization error, optimization error, and excess risk. When evaluated empirically, most of these bounds are numerically vacuous. We argue that generalization measures should instead be evaluated within the framework of distributional robustness.
arXiv Detail & Related papers (2020-10-22T17:54:25Z)
Learning from Failure: Training Debiased Classifier from Biased Classifier [76.52804102765931]
We show that neural networks learn to rely on spurious correlation only when it is "easier" to learn than the desired knowledge. We propose a failure-based debiasing scheme by training a pair of neural networks simultaneously. Our method significantly improves the training of the network against various types of biases in both synthetic and real-world datasets.
arXiv Detail & Related papers (2020-07-06T07:20:29Z)
