Meta-learning Transferable Representations with a Single Target Domain
- URL: http://arxiv.org/abs/2011.01418v1
- Date: Tue, 3 Nov 2020 01:57:37 GMT
- Title: Meta-learning Transferable Representations with a Single Target Domain
- Authors: Hong Liu, Jeff Z. HaoChen, Colin Wei, Tengyu Ma
- Abstract summary: Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
- Score: 46.83481356352768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent works found that fine-tuning and joint training---two popular
approaches for transfer learning---do not always improve accuracy on downstream
tasks. First, we aim to understand more about when and why fine-tuning and
joint training can be suboptimal or even harmful for transfer learning. We
design semi-synthetic datasets where the source task can be solved by either
source-specific features or transferable features. We observe that (1)
pre-training may not have an incentive to learn transferable features and (2)
joint training may simultaneously learn source-specific features and overfit to
the target. Second, to improve over fine-tuning and joint training, we propose
Meta Representation Learning (MeRLin) to learn transferable features. MeRLin
meta-learns representations by ensuring that a head fit on top of the
representations with target training data also performs well on target
validation data. We also prove that MeRLin recovers the target ground-truth
model with a quadratic neural net parameterization and a source distribution
that contains both transferable and source-specific features. On the same
distribution, pre-training and joint training provably fail to learn
transferable features. MeRLin empirically outperforms previous state-of-the-art
transfer learning algorithms on various real-world vision and NLP transfer
learning benchmarks.
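The meta-objective described in the abstract can be sketched as a small bilevel computation: fit a head on target training features (the inner fit), then score that head on target validation data (the outer objective). A minimal numpy sketch, assuming a linear representation and a closed-form least-squares head; these are illustrative simplifications, not the paper's actual implementation:

```python
import numpy as np

def merlin_objective(A, X_tr, y_tr, X_va, y_va):
    """Meta-objective: fit a linear head on target-train features,
    then score it on target-validation features (mean squared error)."""
    H_tr = X_tr @ A                                   # representation on train split
    w, *_ = np.linalg.lstsq(H_tr, y_tr, rcond=None)   # inner fit: the head
    H_va = X_va @ A
    return float(np.mean((H_va @ w - y_va) ** 2))     # outer evaluation

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))        # dim 0: transferable, dim 1: source-specific
y = 2.0 * X[:, 0] + 0.1 * rng.normal(size=200)
X_tr, y_tr, X_va, y_va = X[:100], y[:100], X[100:], y[100:]

A_transfer = np.array([[1.0], [0.0]])   # keeps the transferable feature
A_spurious = np.array([[0.0], [1.0]])   # keeps the source-specific feature

loss_t = merlin_objective(A_transfer, X_tr, y_tr, X_va, y_va)
loss_s = merlin_objective(A_spurious, X_tr, y_tr, X_va, y_va)
```

A representation that keeps the transferable feature yields a far lower meta-objective than one that keeps the source-specific feature, which is the signal the meta-learner optimizes.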
Related papers
- Multi-Stage Knowledge Integration of Vision-Language Models for Continual Learning [79.46570165281084]
We propose a Multi-Stage Knowledge Integration network (MulKI) to emulate the human learning process in distillation methods.
MulKI achieves this through four stages, including Eliciting Ideas, Adding New Ideas, Distinguishing Ideas, and Making Connections.
Our method demonstrates significant improvements in maintaining zero-shot capabilities while supporting continual learning across diverse downstream tasks.
arXiv Detail & Related papers (2024-11-11T07:36:19Z)
- Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z)
- Omni-Training for Data-Efficient Deep Learning [80.28715182095975]
Recent advances reveal that a properly pre-trained model possesses an important property: transferability.
A tight combination of pre-training and meta-training cannot achieve both kinds of transferability.
This motivates the proposed Omni-Training framework towards data-efficient deep learning.
arXiv Detail & Related papers (2021-10-14T16:30:36Z)
- Adversarial Training Helps Transfer Learning via Better Representations [17.497590668804055]
Transfer learning aims to leverage models pre-trained on source data to adapt efficiently to the target setting.
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
We show that adversarial training in the source data generates provably better representations, so fine-tuning on top of this representation leads to a more accurate predictor of the target data.
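The adversarial-training-on-source idea above can be illustrated with the classic FGSM recipe: perturb each input in the direction of the loss gradient, then take the training step on the perturbed batch. A minimal numpy sketch for logistic regression; the model, step sizes, and data are hypothetical stand-ins, not the cited paper's setup:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def adv_train_logreg(X, y, eps=0.1, lr=0.1, steps=200, seed=0):
    """Logistic regression trained on FGSM-perturbed inputs: each step,
    shift x by eps * sign(d loss / d x), then descend on the perturbed batch."""
    rng = np.random.default_rng(seed)
    w = rng.normal(size=X.shape[1]) * 0.01
    for _ in range(steps):
        p = sigmoid(X @ w)
        grad_x = np.outer(p - y, w)            # d loss / d x, one row per example
        X_adv = X + eps * np.sign(grad_x)      # FGSM perturbation
        p_adv = sigmoid(X_adv @ w)
        w -= lr * (X_adv.T @ (p_adv - y)) / len(y)
    return w

# Toy source task: class means separated along dimension 0.
rng = np.random.default_rng(2)
n = 200
y = rng.integers(0, 2, n).astype(float)
X = rng.normal(size=(n, 5))
X[:, 0] += 3.0 * (2.0 * y - 1.0)
w = adv_train_logreg(X, y)
acc = float(np.mean((sigmoid(X @ w) > 0.5) == y))
```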
arXiv Detail & Related papers (2021-06-18T15:41:07Z)
- The Common Intuition to Transfer Learning Can Win or Lose: Case Studies for Linear Regression [26.5147705530439]
We define a transfer learning approach to the target task as a linear regression optimization with a regularization on the distance between the to-be-learned target parameters and the already-learned source parameters.
We show that for sufficiently related tasks, the optimally tuned transfer learning approach can outperform the optimally tuned ridge regression method.
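This regularized objective has a closed form: minimizing ||y - Xb||^2 + lam * ||b - b_src||^2 gives b = (X'X + lam*I)^{-1}(X'y + lam*b_src), with plain ridge recovered by setting b_src = 0. A small numpy sketch of the comparison, with a hypothetical related source task (not the paper's exact case study):

```python
import numpy as np

def transfer_ridge(X, y, b_src, lam):
    """argmin_b ||y - Xb||^2 + lam * ||b - b_src||^2
       = (X'X + lam*I)^-1 (X'y + lam * b_src)."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y + lam * b_src)

rng = np.random.default_rng(1)
d, n = 20, 15                                # underdetermined: fewer samples than dims
b_true = rng.normal(size=d)
b_src = b_true + 0.1 * rng.normal(size=d)    # a sufficiently related source task
X = rng.normal(size=(n, d))
y = X @ b_true + 0.1 * rng.normal(size=n)

b_transfer = transfer_ridge(X, y, b_src, lam=1.0)
b_ridge = transfer_ridge(X, y, np.zeros(d), lam=1.0)  # plain ridge shrinks toward 0

err_transfer = float(np.linalg.norm(b_transfer - b_true))
err_ridge = float(np.linalg.norm(b_ridge - b_true))
```

When the source parameters sit close to the target's, shrinking toward them beats shrinking toward zero, matching the abstract's claim for sufficiently related tasks.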
arXiv Detail & Related papers (2021-03-09T18:46:01Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves over standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)
- Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks [27.44348371795822]
We develop a statistical minimax framework to characterize the limits of transfer learning.
We derive a lower-bound for the target generalization error achievable by any algorithm as a function of the number of labeled source and target data.
arXiv Detail & Related papers (2020-06-16T22:49:26Z)
- Inter- and Intra-domain Knowledge Transfer for Related Tasks in Deep Character Recognition [2.320417845168326]
Pre-training a deep neural network on the ImageNet dataset is a common practice for training deep learning models.
The technique of pre-training on one task and then retraining on a new one is called transfer learning.
In this paper we analyse the effectiveness of using deep transfer learning for character recognition tasks.
arXiv Detail & Related papers (2020-01-02T14:18:25Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.