Omni-Training for Data-Efficient Deep Learning
- URL: http://arxiv.org/abs/2110.07510v1
- Date: Thu, 14 Oct 2021 16:30:36 GMT
- Title: Omni-Training for Data-Efficient Deep Learning
- Authors: Yang Shu, Zhangjie Cao, Jinghan Gao, Jianmin Wang, Mingsheng Long
- Abstract summary: Recent advances reveal that a properly pre-trained model possesses an important property: transferability.
Even a tight combination of pre-training and meta-training cannot achieve both kinds of transferability.
This motivates the proposed Omni-Training framework towards data-efficient deep learning.
- Score: 80.28715182095975
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Learning a generalizable deep model from a few examples in a short time
remains a major challenge of machine learning, which has impeded its wide
deployment to many scenarios. Recent advances reveal that a properly
pre-trained model possesses an important property: transferability. A higher
transferability of the learned representations indicates a better
generalizability across domains of different distributions (domain
transferability), or across tasks of different semantics (task
transferability). Transferability has become the key to enable data-efficient
deep learning, however, existing pre-training methods focus only on the domain
transferability while meta-training methods only on the task transferability.
This restricts their data-efficiency in downstream scenarios of diverging
domains and tasks. A finding of this paper is that even a tight combination of
pre-training and meta-training cannot achieve both kinds of transferability.
This motivates the proposed Omni-Training framework towards data-efficient deep
learning. Our first contribution is Omni-Net, a tri-flow architecture. Besides
the joint representation flow, Omni-Net introduces two new parallel flows for
pre-training and meta-training, respectively responsible for learning
representations of domain transferability and task transferability. Omni-Net
coordinates the parallel flows by routing them via the joint-flow, making each
gain the other kind of transferability. Our second contribution is Omni-Loss,
in which a mean-teacher regularization is imposed to learn generalizable and
stabilized representations. Omni-Training is a general framework that
accommodates many existing pre-training and meta-training algorithms. A
thorough evaluation on cross-task and cross-domain datasets in classification,
regression and reinforcement learning problems shows that Omni-Training
consistently outperforms the state-of-the-art methods.
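A minimal sketch of the tri-flow idea described in the abstract, assuming a residual-style coordination in which each parallel flow passes through the shared joint branch and adds its own lightweight branch on top. The class and flow names (`TriFlowBlock`, `"pre"`, `"meta"`) are illustrative placeholders, not the authors' implementation.

```python
# Hedged sketch of a tri-flow (Omni-Net-style) architecture in PyTorch.
# Assumption: the parallel flows are routed through the joint flow by
# sharing the joint branch at every layer; branch types/sizes are illustrative.
import torch
import torch.nn as nn


class TriFlowBlock(nn.Module):
    """One layer: a shared joint branch plus two lightweight parallel branches."""

    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.joint = nn.Linear(in_dim, out_dim)        # shared joint flow
        self.pre_branch = nn.Linear(in_dim, out_dim)   # domain-transferability (pre-training) flow
        self.meta_branch = nn.Linear(in_dim, out_dim)  # task-transferability (meta-training) flow
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor, flow: str) -> torch.Tensor:
        # Every flow goes through the joint branch, so gradients from the
        # pre-training and meta-training objectives both shape the shared
        # representation, while each parallel flow keeps its own residual branch.
        if flow == "joint":
            out = self.joint(x)
        elif flow == "pre":
            out = self.joint(x) + self.pre_branch(x)
        elif flow == "meta":
            out = self.joint(x) + self.meta_branch(x)
        else:
            raise ValueError(f"unknown flow: {flow}")
        return self.act(out)


class OmniNetSketch(nn.Module):
    """Stack of tri-flow blocks; `flow` selects which representation to compute."""

    def __init__(self, dims=(784, 256, 128)):
        super().__init__()
        self.blocks = nn.ModuleList(
            TriFlowBlock(d_in, d_out) for d_in, d_out in zip(dims[:-1], dims[1:])
        )

    def forward(self, x: torch.Tensor, flow: str = "joint") -> torch.Tensor:
        for block in self.blocks:
            x = block(x, flow)
        return x
```

In this sketch, a pre-training loss would be computed on `model(x, flow="pre")` and an episodic meta-training loss on `model(x, flow="meta")`; because both routes share the joint branch, each parallel flow is also shaped by the other objective.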
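The mean-teacher regularization mentioned for Omni-Loss can likewise be sketched as an exponential-moving-average (EMA) teacher whose outputs serve as a consistency target for the student. The MSE consistency term, the EMA momentum, and the helper names below are assumptions for illustration, not the paper's exact formulation.

```python
# Hedged sketch of mean-teacher style consistency regularization.
# Works with any encoder (e.g., the OmniNetSketch above).
import copy

import torch
import torch.nn.functional as F


def make_teacher(student: torch.nn.Module) -> torch.nn.Module:
    """Create a frozen copy of the student; it is updated only via EMA."""
    teacher = copy.deepcopy(student)
    for p in teacher.parameters():
        p.requires_grad_(False)
    return teacher


@torch.no_grad()
def ema_update(teacher: torch.nn.Module, student: torch.nn.Module,
               momentum: float = 0.999) -> None:
    """teacher <- momentum * teacher + (1 - momentum) * student."""
    for t_p, s_p in zip(teacher.parameters(), student.parameters()):
        t_p.mul_(momentum).add_(s_p, alpha=1.0 - momentum)


def regularized_loss(student, teacher, x, task_loss,
                     consistency_weight: float = 1.0) -> torch.Tensor:
    """Add a consistency penalty between student and teacher representations
    to an already-computed pre-training or meta-training objective."""
    z_student = student(x)
    with torch.no_grad():
        z_teacher = teacher(x)
    return task_loss + consistency_weight * F.mse_loss(z_student, z_teacher)
```

After each optimizer step on the student, calling `ema_update(teacher, student)` keeps the teacher as a temporally averaged, more stable copy, which is the ingredient that stabilizes the learned representations.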
Related papers
- Efficient Transfer Learning for Video-language Foundation Models [13.166348605993292]
We propose a simple yet effective Multi-modal Spatio-Temporal Adapter (MSTA) to improve the alignment between representations in the text and vision branches.
We evaluate the effectiveness of our approach across four tasks: zero-shot transfer, few-shot learning, base-to-novel generalization, and fully-supervised learning.
arXiv Detail & Related papers (2024-11-18T01:25:58Z) - Model-Based Reinforcement Learning with Multi-Task Offline Pretraining [59.82457030180094]
We present a model-based RL method that learns to transfer potentially useful dynamics and action demonstrations from offline data to a novel task.
The main idea is to use the world models not only as simulators for behavior learning but also as tools to measure the task relevance.
We demonstrate the advantages of our approach compared with the state-of-the-art methods in Meta-World and DeepMind Control Suite.
arXiv Detail & Related papers (2023-06-06T02:24:41Z) - Effective Adaptation in Multi-Task Co-Training for Unified Autonomous
Driving [103.745551954983]
In this paper, we investigate the transfer performance of various types of self-supervised methods, including MoCo and SimCLR, on three downstream tasks.
We find that their performances are sub-optimal or even lag far behind the single-task baseline.
We propose a simple yet effective pretrain-adapt-finetune paradigm for general multi-task training.
arXiv Detail & Related papers (2022-09-19T12:15:31Z) - Beyond Transfer Learning: Co-finetuning for Action Localisation [64.07196901012153]
We propose co-finetuning -- simultaneously training a single model on multiple "upstream" and "downstream" tasks.
We demonstrate that co-finetuning outperforms traditional transfer learning when using the same total amount of data.
We also show how we can easily extend our approach to multiple "upstream" datasets to further improve performance.
arXiv Detail & Related papers (2022-07-08T10:25:47Z) - Incremental Learning Meets Transfer Learning: Application to Multi-site
Prostate MRI Segmentation [16.50535949349874]
We propose a novel multi-site segmentation framework called incremental-transfer learning (ITL).
ITL learns a model from multi-site datasets in an end-to-end sequential fashion.
We show for the first time that leveraging our ITL training scheme is able to alleviate challenging catastrophic forgetting problems in incremental learning.
arXiv Detail & Related papers (2022-06-03T02:32:01Z) - Meta-learning Transferable Representations with a Single Target Domain [46.83481356352768]
Fine-tuning and joint training do not always improve accuracy on downstream tasks.
We propose Meta Representation Learning (MeRLin) to learn transferable features.
MeRLin empirically outperforms previous state-of-the-art transfer learning algorithms on various real-world vision and NLP transfer learning benchmarks.
arXiv Detail & Related papers (2020-11-03T01:57:37Z) - Towards Accurate Knowledge Transfer via Target-awareness Representation
Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED)
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer during fine-tuning of the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
arXiv Detail & Related papers (2020-10-16T17:45:08Z) - Uniform Priors for Data-Efficient Transfer [65.086680950871]
We show that features that are most transferable have high uniformity in the embedding space.
We evaluate the regularization on its ability to facilitate adaptation to unseen tasks and data.
arXiv Detail & Related papers (2020-06-30T04:39:36Z)