Unleashing the power of Neural Collapse for Transferability Estimation
- URL: http://arxiv.org/abs/2310.05754v1
- Date: Mon, 9 Oct 2023 14:30:10 GMT
- Title: Unleashing the power of Neural Collapse for Transferability Estimation
- Authors: Yuhe Ding, Bo Jiang, Lijun Sheng, Aihua Zheng, Jian Liang
- Abstract summary: Well-trained models exhibit the phenomenon of Neural Collapse.
We propose a novel method termed Fair Collapse (FaCe) for transferability estimation.
FaCe yields state-of-the-art performance on different tasks including image classification, semantic segmentation, and text classification.
- Score: 42.09673383041276
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transferability estimation aims to provide heuristics for quantifying how
suitable a pre-trained model is for a specific downstream task, without
fine-tuning every candidate. Prior studies have revealed that well-trained models
exhibit the phenomenon of Neural Collapse. Based on a widely used neural
collapse metric in existing literature, we observe a strong correlation between
the neural collapse of pre-trained models and that of their corresponding
fine-tuned models. Inspired by this observation, we propose a novel method termed Fair
Collapse (FaCe) for transferability estimation by comprehensively measuring the
degree of neural collapse in the pre-trained model. Specifically, FaCe comprises
two different terms: the variance collapse term, which assesses the class
separation and within-class compactness, and the class fairness term, which
quantifies the fairness of the pre-trained model towards each class. We
investigate FaCe on a variety of pre-trained classification models across
different network architectures, source datasets, and training loss functions.
Results show that FaCe yields state-of-the-art performance on different tasks
including image classification, semantic segmentation, and text classification,
demonstrating the effectiveness and generality of our method.
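The abstract gives no implementation details, but the variance collapse idea maps onto a standard within-/between-class variance ratio. Below is a minimal sketch, assuming `features` holds penultimate-layer embeddings extracted by the pre-trained model on the target data and `labels` holds class indices; it is an illustrative neural collapse proxy, not the exact FaCe formulation.

```python
import numpy as np

def variance_collapse_score(features: np.ndarray, labels: np.ndarray) -> float:
    """Within-/between-class variance ratio; lower means stronger collapse.

    Illustrative neural-collapse proxy, not the exact FaCe term.
    """
    global_mean = features.mean(axis=0)
    within = between = 0.0
    for c in np.unique(labels):
        class_feats = features[labels == c]
        class_mean = class_feats.mean(axis=0)
        # Within-class scatter: spread of samples around their class mean.
        within += ((class_feats - class_mean) ** 2).sum()
        # Between-class scatter: spread of class means around the global mean.
        between += len(class_feats) * ((class_mean - global_mean) ** 2).sum()
    return float(within / between)

# Toy usage: 200 samples, 64-dim features, 5 classes with class-dependent shifts.
rng = np.random.default_rng(0)
labels = rng.integers(0, 5, size=200)
features = rng.normal(size=(200, 64)) + labels[:, None]
print(variance_collapse_score(features, labels))
```

A class fairness term could analogously compare how evenly per-class compactness is distributed across classes, though the paper should be consulted for its precise definition.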
Related papers
- Towards Theoretical Understandings of Self-Consuming Generative Models [56.84592466204185]
This paper tackles the emerging challenge of training generative models within a self-consuming loop.
We construct a theoretical framework to rigorously evaluate how this training procedure impacts the data distributions learned by future models.
We present results for kernel density estimation, delivering nuanced insights such as the impact of mixed data training on error propagation.
arXiv Detail & Related papers (2024-02-19T02:08:09Z)
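The self-consuming loop in the entry above is straightforward to prototype for kernel density estimation. A minimal 1-D sketch using `scipy.stats.gaussian_kde`, where `alpha` (an illustrative name) controls the fraction of real data mixed into each generation's training set:

```python
import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(0)
real = rng.normal(size=1000)  # ground-truth 1-D data

alpha = 0.5   # fraction of real data in each generation's training mix
train = real
for generation in range(5):
    kde = gaussian_kde(train)  # fit the current generation's "model"
    synthetic = kde.resample(len(real), seed=generation)[0]
    n_real = int(alpha * len(real))
    # The next generation trains on a mix of real and self-generated data.
    train = np.concatenate([real[:n_real], synthetic[: len(real) - n_real]])
    # Track how the learned distribution's spread drifts across generations.
    print(f"generation {generation}: training std = {train.std():.3f}")
```

With `alpha = 0`, each model trains purely on the previous model's samples and the KDE kernel bandwidth inflates the variance generation after generation; mixing in real data damps this error propagation.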
- A PAC-Bayesian Perspective on the Interpolating Information Criterion [54.548058449535155]
We show how a PAC-Bayes bound is obtained for a general class of models, characterizing factors which influence performance in the interpolating regime.
We quantify how the test error for overparameterized models achieving effectively zero training error depends on the quality of the implicit regularization imposed by, e.g., the combination of model and parameter-initialization scheme.
arXiv Detail & Related papers (2023-11-13T01:48:08Z)
- FD-Align: Feature Discrimination Alignment for Fine-tuning Pre-Trained Models in Few-Shot Learning [21.693779973263172]
In this paper, we introduce a fine-tuning approach termed Feature Discrimination Alignment (FD-Align).
Our method aims to bolster the model's generalizability by preserving the consistency of spurious features.
Once fine-tuned, the model can seamlessly integrate with existing methods, leading to performance improvements.
arXiv Detail & Related papers (2023-10-23T17:12:01Z)
- TWINS: A Fine-Tuning Framework for Improved Transferability of Adversarial Robustness and Generalization [89.54947228958494]
This paper focuses on the fine-tuning of an adversarially pre-trained model in various classification tasks.
We propose a novel statistics-based approach, the Two-WIng NormliSation (TWINS) fine-tuning framework.
TWINS is shown to be effective on a wide range of image classification datasets in terms of both generalization and robustness.
arXiv Detail & Related papers (2023-03-20T14:12:55Z)
- Understanding and Improving Transfer Learning of Deep Models via Neural Collapse [37.483109067209504]
This work investigates the relationship between neural collapse (NC) and transfer learning for classification problems.
We find a strong correlation between feature collapse and downstream performance.
Our proposed fine-tuning methods deliver good performance while reducing fine-tuning parameters by at least 90%.
arXiv Detail & Related papers (2022-12-23T08:48:34Z)
- Reconciliation of Pre-trained Models and Prototypical Neural Networks in Few-shot Named Entity Recognition [35.34238362639678]
We propose a one-line-code normalization method to reconcile such a mismatch with empirical and theoretical grounds.
Our work also provides an analytical viewpoint for addressing general problems in few-shot named entity recognition.
arXiv Detail & Related papers (2022-11-07T02:33:45Z)
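The summary above does not spell out the one-line-code normalization, so the sketch below is only one plausible reading (not claimed to be the paper's method): L2-normalize embeddings before computing prototype distances in a prototypical network.

```python
import numpy as np

def l2_normalize(x: np.ndarray, eps: float = 1e-12) -> np.ndarray:
    # The "one line": project embeddings onto the unit sphere.
    return x / (np.linalg.norm(x, axis=-1, keepdims=True) + eps)

def prototype_predict(support: np.ndarray, support_labels: np.ndarray,
                      query: np.ndarray) -> np.ndarray:
    """Nearest-prototype classification over normalized embeddings."""
    support, query = l2_normalize(support), l2_normalize(query)
    classes = np.unique(support_labels)
    prototypes = np.stack([support[support_labels == c].mean(axis=0)
                           for c in classes])
    # Euclidean distance from every query embedding to every prototype.
    dists = np.linalg.norm(query[:, None, :] - prototypes[None, :, :], axis=-1)
    return classes[dists.argmin(axis=1)]
```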
- Memorization-Dilation: Modeling Neural Collapse Under Label Noise [10.134749691813344]
During the terminal phase of training a deep neural network, the feature embeddings of all examples of the same class tend to collapse to a single representation.
Empirical evidence suggests that the memorization of noisy data points leads to a degradation (dilation) of the neural collapse.
Our proofs reveal why label smoothing, a modification of cross-entropy empirically observed to produce a regularization effect, leads to improved generalization in classification tasks.
arXiv Detail & Related papers (2022-06-11T13:40:37Z)
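Label smoothing, the regularizer analyzed in the entry above, has a standard form that is worth stating concretely; the memorization-dilation model itself is not reproduced here.

```python
import numpy as np

def smoothed_cross_entropy(logits: np.ndarray, target: int,
                           epsilon: float = 0.1) -> float:
    """Cross-entropy against a label-smoothed target distribution:
    q_k = (1 - epsilon) * [k == target] + epsilon / K."""
    k = logits.shape[0]
    m = logits.max()
    log_probs = logits - m - np.log(np.exp(logits - m).sum())  # stable log-softmax
    q = np.full(k, epsilon / k)
    q[target] += 1.0 - epsilon
    return float(-(q * log_probs).sum())

print(smoothed_cross_entropy(np.array([2.0, 0.5, -1.0]), target=0))
```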
- On the Generalization and Adaption Performance of Causal Models [99.64022680811281]
Differentiable causal discovery proposes to factorize the data-generating process into a set of modules.
We study the generalization and adaption performance of such modular neural causal models.
Our analysis shows that the modular neural causal models outperform other models on both zero- and few-shot adaptation in low-data regimes.
arXiv Detail & Related papers (2022-06-09T17:12:32Z)
- Multi-loss ensemble deep learning for chest X-ray classification [0.8594140167290096]
Class imbalance is common in medical image classification tasks, where abnormal samples are fewer than normal ones.
We propose novel loss functions to train a deep learning model and analyze its performance in a multiclass classification setting.
arXiv Detail & Related papers (2021-09-29T14:14:04Z)
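The entry above does not specify its loss functions; a common baseline for the class imbalance it describes is inverse-frequency class weighting, sketched here as an illustration rather than the paper's proposal.

```python
import numpy as np

def class_weights(labels: np.ndarray) -> np.ndarray:
    """Inverse-frequency weights: rare classes get proportionally larger weight."""
    counts = np.bincount(labels)
    return len(labels) / (len(counts) * counts)

def weighted_cross_entropy(log_probs: np.ndarray, labels: np.ndarray,
                           weights: np.ndarray) -> float:
    # Per-sample negative log-likelihood, scaled by each sample's class weight.
    nll = -log_probs[np.arange(len(labels)), labels]
    return float((weights[labels] * nll).mean())

labels = np.array([0, 0, 0, 0, 1])  # toy imbalanced labels
print(class_weights(labels))        # class 1 weighted 4x class 0
```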
- On the Transferability of Adversarial Attacks against Neural Text Classifier [121.6758865857686]
We investigate the transferability of adversarial examples for text classification models.
We propose a genetic algorithm to find an ensemble of models that can induce adversarial examples to fool almost all existing models.
We derive word replacement rules that can be used for model diagnostics from these adversarial examples.
arXiv Detail & Related papers (2020-11-17T10:45:05Z)
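The genetic search in the last entry can be sketched generically. Everything below is a bare-bones illustration: `fooling_rate` is a hypothetical placeholder (stubbed with a toy score) standing in for a real attack evaluation against victim classifiers.

```python
import numpy as np

rng = np.random.default_rng(0)
N_MODELS, POP, GENS = 10, 20, 30

def fooling_rate(mask: np.ndarray) -> float:
    """Placeholder fitness: fraction of victim models fooled by adversarial
    examples crafted against the selected ensemble. The toy stub below just
    rewards larger, noisier subsets; swap in a real attack evaluation."""
    return mask.sum() / N_MODELS * rng.uniform(0.8, 1.0)

# Each individual is a binary mask over the model pool (1 = in the ensemble).
population = rng.integers(0, 2, size=(POP, N_MODELS))
for _ in range(GENS):
    fitness = np.array([fooling_rate(m) for m in population])
    parents = population[np.argsort(fitness)][-POP // 2:]        # selection
    cuts = rng.integers(1, N_MODELS, size=POP // 2)
    children = np.array([np.concatenate([a[:c], b[c:]])          # crossover
                         for a, b, c in zip(parents, parents[::-1], cuts)])
    flips = rng.random(children.shape) < 0.05                    # mutation
    children = np.where(flips, 1 - children, children)
    population = np.concatenate([parents, children])

best = population[np.argmax([fooling_rate(m) for m in population])]
print("selected model indices:", np.flatnonzero(best))
```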