On Transfer in Classification: How Well do Subsets of Classes
Generalize?
- URL: http://arxiv.org/abs/2403.03569v1
- Date: Wed, 6 Mar 2024 09:25:22 GMT
- Title: On Transfer in Classification: How Well do Subsets of Classes
Generalize?
- Authors: Raphael Baena, Lucas Drumetz, Vincent Gripon
- Abstract summary: In classification, it is usual to observe that models trained on a given set of classes can generalize to previously unseen ones.
This ability is often leveraged in the context of transfer learning where a pretrained model can be used to process new classes.
In this work, we are interested in laying the foundations of a theoretical framework for transferability between sets of classes.
- Score: 6.38421840998693
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In classification, it is usual to observe that models trained on a given set
of classes can generalize to previously unseen ones, suggesting the ability to
learn beyond the initial task. This ability is often leveraged in the context
of transfer learning, where a pretrained model can be used to process new
classes, with or without fine-tuning. Surprisingly, few papers examine
the theoretical roots of this phenomenon. In this work, we are
interested in laying the foundations of such a theoretical framework for
transferability between sets of classes. Namely, we establish a partially
ordered set of subsets of classes. This tool makes it possible to represent which subsets
of classes can generalize to others. In a more practical setting, we explore
the ability of our framework to predict which subset of classes can lead to the
best performance when testing on all of them. We also explore few-shot
learning, where transfer is the gold standard. Our work contributes to a better
understanding of transfer mechanics and model generalization.
Related papers
- Task Confusion and Catastrophic Forgetting in Class-Incremental Learning: A Mathematical Framework for Discriminative and Generative Modelings [5.899701834228992]
In class-incremental learning (class-IL), models must classify all previously seen classes at test time without task-IDs, leading to task confusion.
We present a novel mathematical framework for class-IL and prove the Infeasibility Theorem, showing optimal class-IL is impossible with discriminative modeling.
(2024-10-28T06:08:38Z)
- A separability-based approach to quantifying generalization: which layer is best? [0.0]
Generalization to unseen data remains poorly understood for deep learning classification and foundation models.
We provide a new method for evaluating the capacity of networks to represent a sampled domain.
We find that (i) high classification accuracy does not imply high generalizability; and (ii) deeper layers in a model do not always generalize the best.
(2024-05-02T17:54:35Z)
- Few-Shot Class-Incremental Learning via Training-Free Prototype Calibration [67.69532794049445]
We find a tendency for existing methods to misclassify the samples of new classes into base classes, which leads to the poor performance of new classes.
We propose a simple yet effective Training-frEE calibratioN (TEEN) strategy to enhance the discriminability of new classes.
(2023-12-08T18:24:08Z)
- Generalization Bounds for Few-Shot Transfer Learning with Pretrained Classifiers [26.844410679685424]
We study the ability of foundation models to learn representations for classification that are transferable to new, unseen classes.
We show that the few-shot error of the learned feature map on new classes is small in case of class-feature-variability collapse.
(2022-12-23T18:46:05Z)
- Do Deep Networks Transfer Invariances Across Classes? [123.84237389985236]
We show how a generative approach for learning the nuisance transformations can help transfer invariances across classes.
Our results provide one explanation for why classifiers generalize poorly on unbalanced and longtailed distributions.
(2022-03-18T04:38:18Z)
- Long-tail Recognition via Compositional Knowledge Transfer [60.03764547406601]
We introduce a novel strategy for long-tail recognition that addresses the tail classes' few-shot problem.
Our objective is to transfer knowledge acquired from information-rich common classes to semantically similar, and yet data-hungry, rare classes.
Experiments show that our approach can achieve significant performance boosts on rare classes while maintaining robust common class performance.
(2021-12-13T15:48:59Z)
- Affinity-Based Hierarchical Learning of Dependent Concepts for Human Activity Recognition [6.187780920448871]
We show that the organization of overlapping classes into hierarchies considerably improves classification performances.
This is particularly true in the case of activity recognition tasks featured in the SHL dataset.
We propose an approach based on transfer affinity among the classes to determine an optimal hierarchy for the learning process.
(2021-04-11T01:08:48Z)
- Partial Is Better Than All: Revisiting Fine-tuning Strategy for Few-shot Learning [76.98364915566292]
A common practice is to train a model on the base set first and then transfer to novel classes through fine-tuning.
We propose to transfer partial knowledge by freezing or fine-tuning particular layer(s) in the base model.
We conduct extensive experiments on CUB and mini-ImageNet to demonstrate the effectiveness of our proposed method.
(2021-02-08T03:27:05Z)
- CLASTER: Clustering with Reinforcement Learning for Zero-Shot Action Recognition [52.66360172784038]
We propose a clustering-based model, which considers all training samples at once, instead of optimizing for each instance individually.
We call the proposed method CLASTER and observe that it consistently improves over the state-of-the-art in all standard datasets.
(2021-01-18T12:46:24Z)
- Subclass Distillation [94.18870689772544]
We show that it is possible to transfer most of the generalization ability of a teacher to a student.
For datasets where there are known, natural subclasses we demonstrate that the teacher learns similar subclasses.
For clickthrough datasets where the subclasses are unknown we demonstrate that subclass distillation allows the student to learn faster and better.
(2020-02-10T16:45:30Z)