Uniform Priors for Data-Efficient Transfer
- URL: http://arxiv.org/abs/2006.16524v2
- Date: Tue, 13 Oct 2020 12:21:07 GMT
- Title: Uniform Priors for Data-Efficient Transfer
- Authors: Samarth Sinha, Karsten Roth, Anirudh Goyal, Marzyeh Ghassemi, Hugo
Larochelle, Animesh Garg
- Abstract summary: We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
- Score: 65.086680950871
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep Neural Networks have shown great promise on a variety of downstream
applications, but their ability to adapt and generalize to new data and tasks
remains a challenge. However, the ability to perform few- or zero-shot
adaptation to novel tasks is important for the scalability and deployment of
machine learning models. It is therefore crucial to understand what makes for
good, transferable features in deep networks that best allow for such
adaptation. In this paper, we shed light on this by showing that features that
are most transferable have high uniformity in the embedding space, and propose a
uniformity regularization scheme that encourages better transfer and feature
reuse. We evaluate the regularization on its ability to facilitate adaptation
to unseen tasks and data, for which we conduct a thorough experimental study
covering four relevant and distinct domains: few-shot Meta-Learning, Deep
Metric Learning, Zero-Shot Domain Adaptation, as well as Out-of-Distribution
classification. Across all experiments, we show that uniformity regularization
consistently offers benefits over baseline methods and is able to achieve
state-of-the-art performance in Deep Metric Learning and Meta-Learning.
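To make the idea concrete, below is a minimal PyTorch sketch of a uniformity regularizer on L2-normalized embeddings, using the pairwise Gaussian-potential formulation. The specific loss form, the weight of 0.1, and the model interface in the usage comment are illustrative assumptions and may differ from the paper's exact regularization scheme.

```python
import torch
import torch.nn.functional as F

def uniformity_loss(embeddings: torch.Tensor, t: float = 2.0) -> torch.Tensor:
    """Log of the mean pairwise Gaussian potential on the unit hypersphere.

    Lower values correspond to embeddings that are spread more uniformly
    over the sphere. Requires a batch of at least two embeddings.
    """
    z = F.normalize(embeddings, dim=-1)         # project features onto the unit sphere
    sq_dists = torch.pdist(z, p=2).pow(2)       # pairwise squared Euclidean distances
    return sq_dists.mul(-t).exp().mean().log()  # small when points are well spread out

# Hypothetical usage inside a training step (model returning logits and features
# is an assumption for illustration):
# logits, features = model(inputs)
# loss = criterion(logits, targets) + 0.1 * uniformity_loss(features)
# loss.backward()
```

Because the regularizer is simply added to the task loss, a term of this kind can be dropped into meta-learning, metric-learning, or classification pipelines without changing the underlying model.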
Related papers
- Features are fate: a theory of transfer learning in high-dimensional regression [23.840251319669907]
We show that when the target task is well represented by the feature space of the pre-trained model, transfer learning outperforms training from scratch.
For this model, we establish rigorously that when the feature space overlap between the source and target tasks is sufficiently strong, both linear transfer and fine-tuning improve performance.
arXiv Detail & Related papers (2024-10-10T17:58:26Z)
- Optimal transfer protocol by incremental layer defrosting [66.76153955485584]
Transfer learning is a powerful tool enabling model training with limited amounts of data.
The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task.
We show that this protocol is often sub-optimal, and that the largest performance gain may be achieved when smaller portions of the pre-trained network are kept frozen (a minimal partial-freezing sketch appears after this list).
arXiv Detail & Related papers (2023-03-02T17:32:11Z)
- Transferability Estimation Based On Principal Gradient Expectation [68.97403769157117]
Cross-task transferability should be compatible with the transferred results while keeping self-consistency.
Existing transferability metrics are estimated on a particular model by comparing the source and target tasks.
We propose Principal Gradient Expectation (PGE), a simple yet effective method for assessing transferability across tasks.
arXiv Detail & Related papers (2022-11-29T15:33:02Z)
- Omni-Training for Data-Efficient Deep Learning [80.28715182095975]
Recent advances reveal that a properly pre-trained model is endowed with an important property: transferability.
A tight combination of pre-training and meta-training cannot achieve both kinds of transferability, i.e., across domains and across tasks.
This motivates the proposed Omni-Training framework towards data-efficient deep learning.
arXiv Detail & Related papers (2021-10-14T16:30:36Z)
- Meta-Learning with Variational Bayes [0.0]
We introduce a new approach to address the more general problem of generative meta-learning.
Our contribution leverages the AEVB framework and mean-field variational Bayes, and creates fast-adapting latent-space generative models.
At the heart of our contribution is a new result, showing that for a broad class of deep generative latent variable models, the relevant VB updates do not depend on any generative neural network.
arXiv Detail & Related papers (2021-03-03T09:02:01Z)
- Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning [96.75889543560497]
In many real-world problems, collecting a large number of labeled samples is infeasible.
Few-shot learning is the dominant approach to address this issue, where the objective is to quickly adapt to novel categories in the presence of a limited number of samples.
We propose a novel training mechanism that simultaneously enforces equivariance and invariance to a general set of geometric transformations.
arXiv Detail & Related papers (2021-03-01T21:14:33Z)
- ProtoDA: Efficient Transfer Learning for Few-Shot Intent Classification [21.933876113300897]
We adopt an alternative approach by transfer learning on an ensemble of related tasks using prototypical networks under the meta-learning paradigm (a minimal prototypical-network sketch appears after this list).
Using intent classification as a case study, we demonstrate that increasing variability in training tasks can significantly improve classification performance.
arXiv Detail & Related papers (2021-01-28T00:19:13Z)
- Meta-learning Transferable Representations with a Single Target Domain [46.83481356352768]
Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
arXiv Detail & Related papers (2020-11-03T01:57:37Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer when fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
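As referenced from the incremental layer defrosting entry above, the following is a minimal PyTorch sketch of partial freezing: all backbone parameters are frozen except the last residual stages and a newly added head. The choice of a torchvision ResNet-18, the number of unfrozen stages, and the optimizer settings are illustrative assumptions rather than the exact protocol evaluated in that paper.

```python
import torch
import torch.nn as nn
from torchvision import models

def build_partially_frozen_resnet(num_classes: int, unfrozen_stages: int = 1) -> nn.Module:
    """Freeze a pre-trained ResNet-18 except its last `unfrozen_stages` stages and a new head."""
    model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

    # Freeze every parameter first.
    for param in model.parameters():
        param.requires_grad = False

    # Unfreeze the last residual stages (layer4 first, then layer3, ...).
    stages = [model.layer4, model.layer3, model.layer2, model.layer1]
    for stage in stages[:unfrozen_stages]:
        for param in stage.parameters():
            param.requires_grad = True

    # Replace the classification head; new parameters are trainable by default.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model

# Only pass the trainable parameters to the optimizer.
model = build_partially_frozen_resnet(num_classes=10, unfrozen_stages=1)
optimizer = torch.optim.SGD(
    (p for p in model.parameters() if p.requires_grad), lr=1e-3, momentum=0.9
)
```

Varying `unfrozen_stages` from 0 (linear probing) up to the full depth (plain fine-tuning) is one simple way to explore the freezing/defrosting trade-off the entry describes.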
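As referenced from the ProtoDA entry above, the sketch below shows a single prototypical-network episode: class prototypes are the means of the support embeddings, and queries are scored by negative squared Euclidean distance to each prototype. The function signature and episode shapes are illustrative assumptions; ProtoDA's task ensembling and data augmentation are not shown.

```python
import torch
import torch.nn.functional as F

def prototypical_episode_loss(
    support: torch.Tensor,         # [n_way * k_shot, dim] embeddings of labeled support examples
    support_labels: torch.Tensor,  # [n_way * k_shot] integer labels in [0, n_way)
    query: torch.Tensor,           # [n_query, dim] embeddings of query examples
    query_labels: torch.Tensor,    # [n_query] integer labels in [0, n_way)
    n_way: int,
) -> torch.Tensor:
    """Cross-entropy over negative squared distances to per-class prototypes."""
    # Prototype for each class = mean of its support embeddings.
    prototypes = torch.stack(
        [support[support_labels == c].mean(dim=0) for c in range(n_way)]
    )                                              # [n_way, dim]
    # Squared Euclidean distance from every query to every prototype.
    dists = torch.cdist(query, prototypes).pow(2)  # [n_query, n_way]
    return F.cross_entropy(-dists, query_labels)
```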