Efficient Equivariant Transfer Learning from Pretrained Models
- URL: http://arxiv.org/abs/2305.09900v2
- Date: Tue, 10 Oct 2023 19:01:41 GMT
- Title: Efficient Equivariant Transfer Learning from Pretrained Models
- Authors: Sourya Basu, Pulkit Katdare, Prasanna Sattigeri, Vijil
Chenthamarakshan, Katherine Driggs-Campbell, Payel Das, Lav R. Varshney
- Abstract summary: We propose lambda-equitune, which averages features using importance weights, lambdas.
These weights are learned directly from the data using a small neural network.
We prove that lambda-equitune is equivariant and a universal approximator of equivariant functions.
- Score: 45.918447685383356
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Efficient transfer learning algorithms are key to the success of foundation
models on diverse downstream tasks even with limited data. Recent works of Basu
et al. (2023) and Kaba et al. (2022) propose group averaging (equitune) and
optimization-based methods, respectively, over features from group-transformed
inputs to obtain equivariant outputs from non-equivariant neural networks.
While Kaba et al. (2022) are only concerned with training from scratch, we find
that equitune performs poorly on equivariant zero-shot tasks despite good
finetuning results. We hypothesize that this is because pretrained models
provide better quality features for certain transformations than others and
simply averaging them is deleterious. Hence, we propose λ-equitune, which
averages the features using importance weights, λs. These weights are
learned directly from the data using a small neural network, leading to
excellent zero-shot and finetuned results that outperform equitune. Further, we
prove that λ-equitune is equivariant and a universal approximator of
equivariant functions. Additionally, we show that the method of Kaba et al.
(2022) used with appropriate loss functions, which we call equizero, also gives
excellent zero-shot and finetuned performance. Both equitune and equizero are
special cases of λ-equitune. To show the simplicity and generality of
our method, we validate on a wide range of diverse applications and models such
as 1) image classification using CLIP, 2) deep Q-learning, 3) fairness in
natural language generation (NLG), 4) compositional generalization in
languages, and 5) image classification using pretrained CNNs such as ResNet and
AlexNet.
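A minimal sketch of the weighted group averaging behind λ-equitune, in PyTorch. The names `backbone`, `group_actions`, `weight_net`, and `feat_dim` are illustrative assumptions rather than the authors' released implementation; the group actions are assumed to be callables that transform the input (e.g., the four rotations of C4 acting on images), and the sketch produces a group-invariant output, whereas an equivariant output would additionally apply the inverse group action to each feature before averaging.

```python
import torch
import torch.nn as nn

class LambdaEquitune(nn.Module):
    """Sketch of lambda-equitune: features of group-transformed inputs are
    averaged with importance weights produced by a small network.
    (Illustrative names; not the authors' released code.)"""

    def __init__(self, backbone: nn.Module, group_actions, feat_dim: int):
        super().__init__()
        self.backbone = backbone                    # pretrained feature extractor
        self.group_actions = group_actions          # callables g: x -> g.x for a discrete group
        self.weight_net = nn.Sequential(            # small net producing lambda(g.x)
            nn.Linear(feat_dim, 64), nn.ReLU(), nn.Linear(64, 1)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        feats, weights = [], []
        for g in self.group_actions:
            f = self.backbone(g(x))                 # features of the transformed input, (B, D)
            feats.append(f)
            weights.append(self.weight_net(f))      # unnormalized importance weight, (B, 1)
        feats = torch.stack(feats, dim=0)           # (|G|, B, D)
        weights = torch.softmax(torch.stack(weights, dim=0), dim=0)  # normalize over the group
        return (weights * feats).sum(dim=0)         # weighted group average, (B, D)
```

For instance, wrapping a frozen CLIP image encoder with the four 90-degree rotations as `group_actions` would give rotation-invariant image features for zero-shot classification; fixing the weights to 1/|G| recovers plain equitune.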
Related papers
- Rethinking Classifier Re-Training in Long-Tailed Recognition: A Simple
Logits Retargeting Approach [102.0769560460338]
We develop a simple logits retargeting approach (LORT) that does not require prior knowledge of the number of samples per class.
Our method achieves state-of-the-art performance on various imbalanced datasets, including CIFAR100-LT, ImageNet-LT, and iNaturalist 2018.
arXiv Detail & Related papers (2024-03-01T03:27:08Z) - Class-Imbalanced Semi-Supervised Learning for Large-Scale Point Cloud
Semantic Segmentation via Decoupling Optimization [64.36097398869774]
Semi-supervised learning (SSL) has been an active research topic for large-scale 3D scene understanding.
The existing SSL-based methods suffer from severe training bias due to class imbalance and long-tail distributions of the point cloud data.
We introduce a new decoupling optimization framework that disentangles feature representation learning and classifier learning in an alternating optimization manner to shift the biased decision boundary effectively.
arXiv Detail & Related papers (2024-01-13T04:16:40Z) - Adaptive manifold for imbalanced transductive few-shot learning [16.627512688664513]
We propose a novel algorithm to address imbalanced transductive few-shot learning, named Adaptive Manifold.
Our method exploits the underlying manifold of the labeled support examples and unlabeled queries by using manifold similarity to predict the class probability distribution per query.
arXiv Detail & Related papers (2023-04-27T15:42:49Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Equi-Tuning: Group Equivariant Fine-Tuning of Pretrained Models [56.88106830869487]
We introduce equi-tuning, a novel fine-tuning method that transforms (potentially non-equivariant) pretrained models into group equivariant models.
We provide applications of equi-tuning on three different tasks: image classification, compositional generalization in language, and fairness in natural language generation.
arXiv Detail & Related papers (2022-10-13T08:45:23Z) - Improving Pre-trained Language Model Fine-tuning with Noise Stability
Regularization [94.4409074435894]
We propose a novel and effective fine-tuning framework named Layerwise Noise Stability Regularization (LNSR).
Specifically, we inject standard Gaussian noise and regularize the hidden representations of the fine-tuned model.
We demonstrate the advantages of the proposed method over other state-of-the-art algorithms including L2-SP, Mixout and SMART.
arXiv Detail & Related papers (2022-06-12T04:42:49Z) - MIO : Mutual Information Optimization using Self-Supervised Binary
Contrastive Learning [19.5917119072985]
We model contrastive learning as a binary classification problem that predicts whether a pair is positive or not.
The proposed method outperforms state-of-the-art algorithms on benchmark datasets such as STL-10, CIFAR-10, and CIFAR-100.
arXiv Detail & Related papers (2021-11-24T17:51:29Z) - eGAN: Unsupervised approach to class imbalance using transfer learning [8.100450025624443]
Class imbalance is an inherent problem in many machine learning classification tasks.
We explore an unsupervised approach to address these imbalances by leveraging transfer learning from pre-trained image classification models to an encoder-based Generative Adversarial Network (eGAN).
The best result, an F1-score of 0.69, was obtained on the CIFAR-10 classification task with an imbalance ratio of 1:2500.
arXiv Detail & Related papers (2021-04-09T02:37:55Z)
This list is automatically generated from the titles and abstracts of the papers on this site.