Meta-learning Transferable Representations with a Single Target Domain
- URL: http://arxiv.org/abs/2011.01418v1
- Date: Tue, 3 Nov 2020 01:57:37 GMT
- Title: Meta-learning Transferable Representations with a Single Target Domain
- Authors: Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma
- Abstract summary: Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
- Score: 46.83481356352768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works found that fine-tuning and joint training---two popular
approaches for transfer learning---do not always improve accuracy on downstream
tasks. First, we aim to understand more about when and why fine-tuning and
joint training can be suboptimal or even harmful for transfer learning. We
design semi-synthetic datasets where the source task can be solved by either
source-specific features or transferable features. We observe that (1)
pre-training may not have an incentive to learn transferable features and (2)
joint training may simultaneously learn source-specific features and overfit to
the target. Second, to improve over fine-tuning and joint training, we propose
Meta Representation Learning (MeRLin) to learn transferable features. MeRLin
meta-learns representations by ensuring that a head fit on top of the
representations with target training data also performs well on target
validation data. We also prove that MeRLin recovers the target ground-truth
model with a quadratic neural net parameterization and a source distribution
that contains both transferable and source-specific features. On the same
distribution, pre-training and joint training provably fail to learn
transferable features. MeRLin empirically outperforms previous state-of-the-art
transfer learning algorithms on various real-world vision and NLP transfer
learning benchmarks.
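The meta-objective described in the abstract can be sketched as a small bilevel computation: fit a head on target training features (the inner fit), then score that head on target validation data (the outer objective). A minimal numpy sketch, assuming a linear representation and a closed-form least-squares head; these are illustrative simplifications, not the paper's actual implementation:

```python
import numpy as np

def merlin_objective(A, X_tr, y_tr, X_va, y_va):
    """Meta-objective: fit a linear head on target-train features,
    then score it on target-validation features (mean squared error)."""
    H_tr = X_tr @ A                                   # representation on train split
    w, *_ = np.linalg.lstsq(H_tr, y_tr, rcond=None)   # inner fit: the head
    H_va = X_va @ A
    return float(np.mean((H_va @ w - y_va) ** 2))     # outer evaluation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))        # dim 0: transferable, dim 1: source-specific
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

A_transfer = np.array([[1.0], [0.0]])   # keeps the transferable feature
A_spurious = np.array([[0.0], [1.0]])   # keeps the source-specific feature

loss_t = merlin_objective(A_transfer, X_tr, y_tr, X_va, y_va)
loss_s = merlin_objective(A_spurious, X_tr, y_tr, X_va, y_va)
```

A representation that keeps the transferable feature yields a far lower meta-objective than one that keeps the source-specific feature, which is the signal the meta-learner optimizes.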
Related papers
- Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Omni-Training for Data-Efficient Deep Learning [80.28715182095975]
Recent advances reveal that a properly pre-trained model possesses an important property: transferability.
A tight combination of pre-training and meta-training cannot achieve both kinds of transferability.
This motivates the proposed Omni-Training framework towards data-efficient deep learning.
arXiv Detail & Related papers (2021-10-14T16:30:36Z)
- Adversarial Training Helps Transfer Learning via Better Representations [17.497590668804055]
Transfer learning aims to leverage models pre-trained on source data to adapt efficiently to the target setting.
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
We show that adversarial training in the source data generates provably better representations, so fine-tuning on top of this representation leads to a more accurate predictor of the target data.
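The adversarial-training-on-source idea above can be illustrated with the classic FGSM recipe: perturb each input in the direction of the loss gradient, then take the training step on the perturbed batch. A minimal numpy sketch for logistic regression; the model, step sizes, and data are hypothetical stand-ins, not the cited paper's setup:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train_logreg(X, y, eps=0.1, lr=0.1, steps=200, seed=0):
    """Logistic regression trained on FGSM-perturbed inputs: each step,
    shift x by eps * sign(d loss / d x), then descend on the perturbed batch."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1]) * 0.01
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_x = np.outer(p - y, w)            # d loss / d x, one row per example
        X_adv = X + eps * np.sign(grad_x)      # FGSM perturbation
        p_adv = sigmoid(X_adv @ w)
        w -= lr * (X_adv.T @ (p_adv - y)) / len(y)
    return w

# Toy source task: class means separated along dimension 0.
rng = np.random.default_rng(2)
n = 200
y = rng.integers(0, 2, n).astype(float)
X = rng.normal(size=(n, 5))
X[:, 0] += 3.0 * (2.0 * y - 1.0)
w = adv_train_logreg(X, y)
acc = float(np.mean((sigmoid(X @ w) > 0.5) == y))
```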
arXiv Detail & Related papers (2021-06-18T15:41:07Z)
- The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression [26.5147705530439]
We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters.
We show that for sufficiently related tasks, the optimally tuned transfer learning approach can outperform the optimally tuned ridge regression method.
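This regularized objective has a closed form: minimizing ||y - Xb||^2 + lam * ||b - b_src||^2 gives b = (X'X + lam*I)^{-1}(X'y + lam*b_src), with plain ridge recovered by setting b_src = 0. A small numpy sketch of the comparison, with a hypothetical related source task (not the paper's exact case study):

```python
import numpy as np

def transfer_ridge(X, y, b_src, lam):
    """argmin_b ||y - Xb||^2 + lam * ||b - b_src||^2
       = (X'X + lam*I)^-1 (X'y + lam * b_src)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * b_src)

rng = np.random.default_rng(1)
d, n = 20, 15                                # underdetermined: fewer samples than dims
b_true = rng.normal(size=d)
b_src = b_true + 0.1 * rng.normal(size=d)    # a sufficiently related source task
X = rng.normal(size=(n, d))
y = X @ b_true + 0.1 * rng.normal(size=n)

b_transfer = transfer_ridge(X, y, b_src, lam=1.0)
b_ridge = transfer_ridge(X, y, np.zeros(d), lam=1.0)  # plain ridge shrinks toward 0

err_transfer = float(np.linalg.norm(b_transfer - b_true))
err_ridge = float(np.linalg.norm(b_ridge - b_true))
```

When the source parameters sit close to the target's, shrinking toward them beats shrinking toward zero, matching the abstract's claim for sufficiently related tasks.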
arXiv Detail & Related papers (2021-03-09T18:46:01Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves over standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks [27.44348371795822]
We develop a statistical minimax framework to characterize the limits of transfer learning.
We derive a lower-bound for the target generalization error achievable by any algorithm as a function of the number of labeled source and target data.
arXiv Detail & Related papers (2020-06-16T22:49:26Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.