Geometry-Aware Adaptation for Pretrained Models
- URL: http://arxiv.org/abs/2307.12226v2
- Date: Tue, 28 Nov 2023 04:35:51 GMT
- Title: Geometry-Aware Adaptation for Pretrained Models
- Authors: Nicholas Roberts, Xintong Li, Dyah Adila, Sonia Cromp, Tzu-Heng Huang,
Jitian Zhao, Frederic Sala
- Abstract summary: We propose a drop-in replacement of the standard prediction rule, swapping argmax with the Fréchet mean.
Our proposed approach, Loki, gains up to 29.7% relative improvement over SimCLR on ImageNet.
When no such metric is available, Loki can use self-derived metrics from class embeddings and obtains a 10.5% improvement on pretrained zero-shot models.
- Score: 15.715395029966812
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Machine learning models -- including prominent zero-shot models -- are often
trained on datasets whose labels are only a small proportion of a larger label
space. Such spaces are commonly equipped with a metric that relates the labels
via distances between them. We propose a simple approach to exploit this
information to adapt the trained model to reliably predict new classes -- or,
in the case of zero-shot prediction, to improve its performance -- without any
additional training. Our technique is a drop-in replacement of the standard
prediction rule, swapping argmax with the Fréchet mean. We provide a
comprehensive theoretical analysis for this approach, studying (i)
learning-theoretic results trading off label space diameter, sample complexity,
and model dimension, (ii) characterizations of the full range of scenarios in
which it is possible to predict any unobserved class, and (iii) an optimal
active learning-like next class selection procedure to obtain optimal training
classes for when it is not possible to predict the entire range of unobserved
classes. Empirically, using easily-available external metrics, our proposed
approach, Loki, gains up to 29.7% relative improvement over SimCLR on ImageNet
and scales to hundreds of thousands of classes. When no such metric is
available, Loki can use self-derived metrics from class embeddings and obtains
a 10.5% improvement on pretrained zero-shot models such as CLIP.
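
The prediction rule described in the abstract can be illustrated with a small sketch. It assumes the Fréchet mean is taken over a finite label space as the label minimizing the probability-weighted sum of squared distances to the observed classes. The function name `frechet_mean_predict`, its arguments, and the toy path-graph metric are hypothetical choices for illustration, not the authors' implementation.

```python
def frechet_mean_predict(probs, dist, observed):
    """Pick the label minimizing sum_i probs[i] * dist[y][observed[i]] ** 2.

    probs    -- model probabilities over the observed (trained) classes
    dist     -- full pairwise distance matrix over the whole label space
    observed -- index of each observed class within the whole label space
    All three names are assumptions made for this sketch.
    """
    best_label, best_cost = None, float("inf")
    for y in range(len(dist)):
        cost = sum(p * dist[y][c] ** 2 for p, c in zip(probs, observed))
        if cost < best_cost:  # strict comparison: ties keep the lowest index
            best_label, best_cost = y, cost
    return best_label


# Toy label space: five labels on a line, with d(i, j) = |i - j|.
dist = [[abs(i - j) for j in range(5)] for i in range(5)]
observed = [0, 2, 4]  # the model was trained on labels 0, 2 and 4 only

batch = [
    [0.5, 0.5, 0.0],  # mass split between labels 0 and 2
    [0.9, 0.1, 0.0],  # confident in label 0
    [0.0, 0.5, 0.5],  # mass split between labels 2 and 4
]
preds = [frechet_mean_predict(p, dist, observed) for p in batch]
print(preds)
```

In this toy setting the first and third inputs are assigned to labels 1 and 3, which the model never saw during training, while a confident prediction stays on its observed class. A plain argmax over `probs` could only ever return labels 0, 2, or 4.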
Related papers
- Probably Approximately Precision and Recall Learning [62.912015491907994]
Precision and Recall are foundational metrics in machine learning.
One-sided feedback--where only positive examples are observed during training--is inherent in many practical problems.
We introduce a PAC learning framework where each hypothesis is represented by a graph, with edges indicating positive interactions.
arXiv Detail & Related papers (2024-11-20T04:21:07Z)
- Towards An Online Incremental Approach to Predict Students Performance [0.8287206589886879]
We propose a memory-based online incremental learning approach for updating an online classifier.
Our approach achieves a notable improvement in model accuracy, with an enhancement of nearly 10% compared to the current state-of-the-art.
arXiv Detail & Related papers (2024-05-03T17:13:26Z)
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that requires no prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z)
- Uncertainty-aware Sampling for Long-tailed Semi-supervised Learning [89.98353600316285]
We introduce uncertainty into the modeling process for pseudo-label sampling, taking into account that the model performance on the tailed classes varies over different training stages.
This approach allows the model to perceive the uncertainty of pseudo-labels at different training stages, thereby adaptively adjusting the selection thresholds for different classes.
Compared to other methods such as the baseline method FixMatch, UDTS achieves an increase in accuracy of at least approximately 5.26%, 1.75%, 9.96%, and 1.28% on the natural scene image datasets.
arXiv Detail & Related papers (2024-01-09T08:59:39Z)
- Continual Learning in Open-vocabulary Classification with Complementary Memory Systems [19.337633598158778]
We introduce a method for flexible and efficient continual learning in open-vocabulary image classification.
We combine predictions from a CLIP zero-shot model and the exemplar-based model, using the zero-shot estimated probability that a sample's class is within the exemplar classes.
We also propose a "tree probe" method, an adaptation of lazy learning principles, which enables fast learning from new examples with accuracy competitive with batch-trained linear models.
arXiv Detail & Related papers (2023-07-04T01:47:34Z)
- CLIPood: Generalizing CLIP to Out-of-Distributions [73.86353105017076]
Contrastive language-image pre-training (CLIP) models have shown impressive zero-shot ability, but further adaptation of CLIP on downstream tasks undesirably degrades out-of-distribution (OOD) performance.
We propose CLIPood, a fine-tuning method that can adapt CLIP models to OOD situations where both domain shifts and open classes may occur on unseen test data.
Experiments on diverse datasets with different OOD scenarios show that CLIPood consistently outperforms existing generalization techniques.
arXiv Detail & Related papers (2023-02-02T04:27:54Z)
- GMM-IL: Image Classification using Incrementally Learnt, Independent Probabilistic Models for Small Sample Sizes [0.4511923587827301]
We present a novel two-stage architecture which couples visual feature learning with probabilistic models to represent each class.
We outperform a benchmark of an equivalent network with a Softmax head, obtaining increased accuracy for sample sizes smaller than 12 and increased weighted F1 score for 3 imbalanced class profiles.
arXiv Detail & Related papers (2022-12-01T15:19:42Z)
- Self-Distillation for Further Pre-training of Transformers [83.84227016847096]
We propose self-distillation as a regularization for a further pre-training stage.
We empirically validate the efficacy of self-distillation on a variety of benchmark datasets for image and text classification tasks.
arXiv Detail & Related papers (2022-09-30T02:25:12Z)
- Lightweight Conditional Model Extrapolation for Streaming Data under Class-Prior Shift [27.806085423595334]
We introduce LIMES, a new method for learning with non-stationary streaming data.
We learn a single set of model parameters from which a specific classifier for any specific data distribution is derived.
Experiments on a set of exemplary tasks using Twitter data show that LIMES achieves higher accuracy than alternative approaches.
arXiv Detail & Related papers (2022-06-10T15:19:52Z)
- Uncertainty-aware Self-training for Text Classification with Few Labels [54.13279574908808]
We study self-training as one of the earliest semi-supervised learning approaches to reduce the annotation bottleneck.
We propose an approach to improve self-training by incorporating uncertainty estimates of the underlying neural network.
We show that our methods, using only 20-30 labeled samples per class per task for training and validation, can perform within 3% of fully supervised pre-trained language models.
arXiv Detail & Related papers (2020-06-27T08:13:58Z)
This list is automatically generated from the titles and abstracts of the papers in this site.