Do ImageNet-trained models learn shortcuts? The impact of frequency shortcuts on generalization
- URL: http://arxiv.org/abs/2503.03519v2
- Date: Sat, 22 Mar 2025 14:58:05 GMT
- Title: Do ImageNet-trained models learn shortcuts? The impact of frequency shortcuts on generalization
- Authors: Shunxin Wang, Raymond Veldhuis, Nicola Strisciuglio
- Abstract summary: Frequency shortcuts refer to specific frequency patterns that models heavily rely on for correct classification. Previous studies have shown that models trained on small image datasets often exploit such shortcuts. We propose the first approach to more efficiently analyze frequency shortcuts at a large scale.
- Score: 2.784223169208081
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Frequency shortcuts refer to specific frequency patterns that models heavily rely on for correct classification. Previous studies have shown that models trained on small image datasets often exploit such shortcuts, potentially impairing their generalization performance. However, existing methods for identifying frequency shortcuts require expensive computations and become impractical for analyzing models trained on large datasets. In this work, we propose the first approach to more efficiently analyze frequency shortcuts at a large scale. We show that both CNN and transformer models learn frequency shortcuts on ImageNet. We also expose that frequency shortcut solutions can yield good performance on out-of-distribution (OOD) test sets which largely retain texture information. However, these shortcuts, mostly aligned with texture patterns, hinder model generalization on rendition-based OOD test sets. These observations suggest that current OOD evaluations often overlook the impact of frequency shortcuts on model generalization. Future benchmarks could thus benefit from explicitly assessing and accounting for these shortcuts to build models that generalize across a broader range of OOD scenarios.
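To make the notion of a frequency shortcut concrete, here is a minimal probe, a sketch rather than the authors' method: reconstruct an image from a single Fourier band and check whether an ImageNet classifier's prediction survives. The choice of ResNet-18, the band radius of 16, and the file name `example.jpg` are illustrative assumptions.

```python
# Minimal frequency-reliance probe (a sketch, not the paper's method):
# classify an image rebuilt from one Fourier frequency band and compare
# against the full-spectrum prediction.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

def keep_band(image: torch.Tensor, r_low: float, r_high: float) -> torch.Tensor:
    """Zero out all frequencies whose radius falls outside [r_low, r_high)."""
    h, w = image.shape[-2:]
    yy, xx = torch.meshgrid(torch.arange(h) - h // 2,
                            torch.arange(w) - w // 2, indexing="ij")
    radius = torch.sqrt(yy.float() ** 2 + xx.float() ** 2)
    mask = ((radius >= r_low) & (radius < r_high)).float()
    spec = torch.fft.fftshift(torch.fft.fft2(image), dim=(-2, -1)) * mask
    return torch.fft.ifft2(torch.fft.ifftshift(spec, dim=(-2, -1))).real

model = models.resnet18(weights="IMAGENET1K_V1").eval()
preprocess = T.Compose([T.Resize(256), T.CenterCrop(224), T.ToTensor(),
                        T.Normalize([0.485, 0.456, 0.406],
                                    [0.229, 0.224, 0.225])])
image = preprocess(Image.open("example.jpg").convert("RGB"))  # placeholder path

with torch.no_grad():
    full = model(image.unsqueeze(0)).argmax(1).item()
    low = model(keep_band(image, 0, 16).unsqueeze(0)).argmax(1).item()

# If the low-pass reconstruction alone reproduces the full prediction for many
# images of a class, that class is a candidate frequency shortcut.
print(full, low, full == low)
```

Running such probes per frequency band over all 1000 ImageNet classes is costly, which is why a more efficient large-scale analysis, as the paper proposes, matters.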
Related papers
- Sharpening Neural Implicit Functions with Frequency Consolidation Priors [53.6277160912059]
Signed Distance Functions (SDFs) are vital implicit representations of high-fidelity 3D surfaces. Current methods mainly leverage a neural network to learn an SDF from various supervision, including signed distances, 3D point clouds, or multi-view images. We introduce a method to sharpen a low-frequency SDF observation by recovering its high-frequency components, pursuing a sharper and more complete surface.
arXiv Detail & Related papers (2024-12-27T16:18:46Z)
- Towards Combating Frequency Simplicity-biased Learning for Domain Generalization [36.777767173275336]
Domain generalization methods aim to learn transferable knowledge from source domains that can generalize well to unseen target domains.
Recent studies show that neural networks often exhibit simplicity-biased learning behavior, leading to over-reliance on specific frequency sets.
We propose two effective data augmentation modules that collaboratively and adaptively adjust the frequency characteristics of the dataset (a minimal sketch of this style of augmentation appears after this list).
arXiv Detail & Related papers (2024-10-21T16:17:01Z)
- Exposing Image Classifier Shortcuts with Counterfactual Frequency (CoF) Tables [0.0]
'Shortcuts' are easy-to-learn patterns from the training data that fail to generalise to new data. Examples include the use of a copyright watermark to recognise horses, a snowy background to recognise huskies, or ink markings to detect malignant skin lesions. We introduce Counterfactual Frequency (CoF) tables, a novel approach that aggregates instance-based explanations into global insights.
arXiv Detail & Related papers (2024-05-24T15:58:02Z)
- Robust Collaborative Filtering to Popularity Distribution Shift [56.78171423428719]
We present a simple yet effective debiasing strategy, PopGo, which quantifies and reduces the interaction-wise popularity shortcut without assumptions on the test data.
On both ID and OOD test sets, PopGo achieves significant gains over the state-of-the-art debiasing strategies.
arXiv Detail & Related papers (2023-10-16T04:20:52Z)
- Understanding prompt engineering may not require rethinking generalization [56.38207873589642]
We show that the discrete nature of prompts, combined with a PAC-Bayes prior given by a language model, results in generalization bounds that are remarkably tight by the standards of the literature.
This work provides a possible justification for the widespread practice of prompt engineering.
arXiv Detail & Related papers (2023-10-06T00:52:48Z)
- What do neural networks learn in image classification? A frequency shortcut perspective [3.9858496473361402]
This study empirically investigates the learning dynamics of frequency shortcuts in neural networks (NNs).
We show that NNs tend to find simple solutions for classification, and what they learn first during training depends on the most distinctive frequency characteristics.
We propose a metric to measure class-wise frequency characteristics and a method to identify frequency shortcuts.
arXiv Detail & Related papers (2023-07-19T08:34:25Z)
- Why Machine Reading Comprehension Models Learn Shortcuts? [56.629192589376046]
We argue that a larger proportion of shortcut questions in the training data makes models rely excessively on shortcut tricks.
A thorough empirical analysis shows that MRC models tend to learn shortcut questions earlier than challenging questions.
arXiv Detail & Related papers (2021-06-02T08:43:12Z)
- Visualising Deep Network's Time-Series Representations [93.73198973454944]
Despite the growing adoption of machine learning models, they often still operate as black boxes, offering no insight into what is happening inside the model.
In this paper, a method that addresses that issue is proposed, with a focus on visualising multi-dimensional time-series data.
Experiments on a high-frequency stock market dataset show that the method provides fast and discernible visualisations.
arXiv Detail & Related papers (2021-03-12T09:53:34Z)
- Towards Interpreting and Mitigating Shortcut Learning Behavior of NLU models [53.36605766266518]
We show that trained NLU models have a strong preference for features located at the head of the long-tailed distribution.
We propose a shortcut mitigation framework that discourages the model from making overconfident predictions on samples with a large shortcut degree.
arXiv Detail & Related papers (2021-03-11T19:39:56Z)
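As a rough illustration of the frequency-characteristic augmentation idea summarized in the Domain Generalization entry above, the sketch below randomly rescales the amplitude of concentric Fourier bands while keeping phase intact, so that no single band remains a uniformly easy cue. This is a hedged sketch, not the modules proposed in that paper; the band layout, scale range, and [0, 1] pixel range are assumptions.

```python
# Hedged sketch of frequency-domain augmentation (assumed design, not the
# paper's exact modules): rescale the amplitude of concentric frequency
# bands by random factors, leaving the phase untouched.
import torch

def random_band_rescale(img: torch.Tensor, n_bands: int = 8,
                        scale_range: tuple = (0.5, 1.5)) -> torch.Tensor:
    """img: [C, H, W] float tensor with values in [0, 1]."""
    h, w = img.shape[-2:]
    spec = torch.fft.fftshift(torch.fft.fft2(img), dim=(-2, -1))
    amp, phase = spec.abs(), spec.angle()
    # Radial distance of each frequency coordinate from the spectrum center.
    yy, xx = torch.meshgrid(torch.arange(h) - h // 2,
                            torch.arange(w) - w // 2, indexing="ij")
    radius = torch.sqrt(yy.float() ** 2 + xx.float() ** 2)
    band_idx = torch.clamp((radius / (radius.max() / n_bands)).long(),
                           max=n_bands - 1)
    for k in range(n_bands):
        factor = torch.empty(1).uniform_(*scale_range)
        amp = torch.where(band_idx == k, amp * factor, amp)
    out = torch.fft.ifft2(
        torch.fft.ifftshift(torch.polar(amp, phase), dim=(-2, -1))
    ).real
    return out.clamp(0.0, 1.0)

augmented = random_band_rescale(torch.rand(3, 224, 224))
```

Because every band's amplitude is perturbed independently per sample, no fixed frequency set stays reliably predictive across epochs, which is the property such augmentation aims for.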