Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions
- URL: http://arxiv.org/abs/2209.01501v1
- Date: Sat, 3 Sep 2022 21:22:14 GMT
- Title: Meta-Learning with Less Forgetting on Large-Scale Non-Stationary Task Distributions
- Authors: Zhenyi Wang, Li Shen, Le Fang, Qiuling Suo, Donglin Zhan, Tiehang Duan, Mingchen Gao
- Abstract summary: We name this problem Semi-supervised meta-learning with Evolving Task diStributions, abbreviated as SETS.
We propose an OOD Robust and knowleDge presErved semi-supeRvised meta-learning approach (ORDER) to tackle these two major challenges.
Specifically, our ORDER introduces a novel mutual information regularization to robustify the model with unlabeled OOD data and adopts an optimal transport regularization to remember previously learned knowledge in feature space.
- Score: 8.88133567816717
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The paradigm of machine intelligence is moving from purely supervised learning to a more practical scenario in which large amounts of loosely related unlabeled data are available and labeled data is scarce. Most existing algorithms assume that the underlying task distribution is stationary. Here we consider a more realistic and challenging setting in which task distributions evolve over time. We name this problem Semi-supervised meta-learning with Evolving Task diStributions, abbreviated as SETS. Two key challenges arise in this more realistic setting: (i) how to use unlabeled data in the presence of a large amount of unlabeled out-of-distribution (OOD) data; and (ii) how to prevent catastrophic forgetting of previously learned task distributions due to the task distribution shift. We propose an OOD Robust and knowleDge presErved semi-supeRvised meta-learning approach (ORDER) to tackle these two major challenges. Specifically, ORDER introduces a novel mutual information regularization to robustify the model with unlabeled OOD data and adopts an optimal transport regularization to remember previously learned knowledge in feature space. In addition, we test our method on a very challenging dataset: SETS on large-scale non-stationary semi-supervised task distributions consisting of (at least) 72K tasks. With extensive experiments, we demonstrate that the proposed ORDER alleviates forgetting on evolving task distributions and is more robust to OOD data than related strong baselines.
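The abstract names the two regularizers but not their exact forms. Below is a minimal sketch, assuming the mutual information term is estimated InfoMax-style from unlabeled logits and the optimal transport term is an entropic (Sinkhorn) distance between current features and features stored from earlier task distributions; all names and formulas here are illustrative stand-ins, not the paper's implementation.

```python
# Illustrative sketch only: ORDER's exact regularizers are not given in the
# abstract. `mi_regularizer` uses a standard InfoMax-style MI estimate and
# `ot_regularizer` an entropic (Sinkhorn) OT cost; both are assumptions.
import torch


def mi_regularizer(logits_unlabeled: torch.Tensor) -> torch.Tensor:
    """MI proxy I(x; y_hat) ~= H(E[p]) - E[H(p)] over a batch of unlabeled logits."""
    p = logits_unlabeled.softmax(dim=-1)                 # (B, C) class posteriors
    marginal = p.mean(dim=0)                             # batch-averaged posterior
    h_marginal = -(marginal * marginal.clamp_min(1e-8).log()).sum()
    h_conditional = -(p * p.clamp_min(1e-8).log()).sum(dim=-1).mean()
    return h_marginal - h_conditional


def ot_regularizer(feats_now: torch.Tensor, feats_past: torch.Tensor,
                   eps: float = 0.05, iters: int = 50) -> torch.Tensor:
    """Entropic OT cost between current features and stored past features."""
    cost = torch.cdist(feats_now, feats_past) ** 2       # (B, M) transport costs
    K = torch.exp(-cost / eps)                           # Gibbs kernel
    u = torch.full((cost.size(0),), 1.0 / cost.size(0))  # uniform source marginal
    v = torch.full((cost.size(1),), 1.0 / cost.size(1))  # uniform target marginal
    a, b = u.clone(), v.clone()
    for _ in range(iters):                               # Sinkhorn fixed-point updates
        a = u / (K @ b).clamp_min(1e-8)
        b = v / (K.t() @ a).clamp_min(1e-8)
    plan = a.unsqueeze(1) * K * b.unsqueeze(0)           # transport plan
    return (plan * cost).sum()
```

A meta-training step could then combine a supervised loss with weighted versions of these two terms; the weights, and even the sign of the MI term, depend on details the abstract does not give.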
Related papers
- Semi-Supervised One-Shot Imitation Learning [83.94646047695412]
One-shot Imitation Learning aims to imbue AI agents with the ability to learn a new task from a single demonstration.
We introduce the semi-supervised OSIL problem setting, where the learning agent is presented with a large dataset of trajectories.
We develop an algorithm specifically applicable to this semi-supervised OSIL setting.
arXiv Detail & Related papers (2024-08-09T18:11:26Z)
- Data-CUBE: Data Curriculum for Instruction-based Sentence Representation Learning [85.66907881270785]
We propose a data curriculum method, namely Data-CUBE, that arranges the orders of all the multi-task data for training.
At the task level, we aim to find the optimal task order that minimizes the total cross-task interference risk.
At the instance level, we measure the difficulty of all instances per task, then divide them into easy-to-difficult mini-batches for training.
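A minimal sketch of this two-level curriculum, assuming a brute-force search over task orders and caller-supplied `interference` and `difficulty` measures (the paper's actual estimators are not described in this summary):

```python
# Hypothetical sketch of a Data-CUBE-style curriculum; `interference` and
# `difficulty` stand in for the paper's cross-task interference risk and
# per-instance difficulty measures.
from itertools import permutations
from typing import Callable, List, Sequence


def order_tasks(tasks: Sequence[str],
                interference: Callable[[str, str], float]) -> List[str]:
    """Task level: pick the order minimizing summed interference between
    consecutive tasks (brute force, fine for a small number of tasks)."""
    best = min(permutations(tasks),
               key=lambda seq: sum(interference(a, b)
                                   for a, b in zip(seq, seq[1:])))
    return list(best)


def easy_to_difficult_batches(instances: Sequence, difficulty: Callable,
                              batch_size: int) -> List[list]:
    """Instance level: sort a task's instances by difficulty, then chunk them
    into mini-batches so training sees easy examples first."""
    ranked = sorted(instances, key=difficulty)
    return [list(ranked[i:i + batch_size])
            for i in range(0, len(ranked), batch_size)]
```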
arXiv Detail & Related papers (2024-01-07T18:12:20Z)
- Task-Distributionally Robust Data-Free Meta-Learning [99.56612787882334]
Data-Free Meta-Learning (DFML) aims to efficiently learn new tasks by leveraging multiple pre-trained models without requiring their original training data.
For the first time, we reveal two major challenges hindering their practical deployment: Task-Distribution Shift (TDS) and Task-Distribution Corruption (TDC).
arXiv Detail & Related papers (2023-11-23T15:46:54Z)
- Disentangled Latent Spaces Facilitate Data-Driven Auxiliary Learning [15.41342100228504]
In deep learning, auxiliary objectives are often used to facilitate learning in situations where data is scarce.
We propose a novel framework, dubbed Detaux, whereby a weakly supervised disentanglement procedure is used to discover new unrelated classification tasks.
arXiv Detail & Related papers (2023-10-13T17:40:39Z)
- Task Compass: Scaling Multi-task Pre-training with Task Prefix [122.49242976184617]
Existing studies show that multi-task learning with large-scale supervised tasks suffers from negative effects across tasks.
We propose a task prefix guided multi-task pre-training framework to explore the relationships among tasks.
Our model can not only serve as the strong foundation backbone for a wide range of tasks but also be feasible as a probing tool for analyzing task relationships.
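The summary does not show the prefix format; as a rough illustration, a task prefix can be as simple as a task tag prepended to each training example (the tags and template below are hypothetical, not the paper's exact scheme):

```python
# Hypothetical illustration of task-prefix formatting for multi-task
# pre-training; the tag vocabulary and template are assumptions.
def with_task_prefix(task: str, text: str) -> str:
    """Prepend a task identifier so a shared model can condition on the task."""
    return f"[{task}] {text}"


batch = [
    with_task_prefix("nli", "premise: A dog runs. hypothesis: An animal moves."),
    with_task_prefix("sts", "sentence1: I like tea. sentence2: Tea is my favorite."),
]
```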
arXiv Detail & Related papers (2022-10-12T15:02:04Z)
- Uncertainty-Aware Meta-Learning for Multimodal Task Distributions [3.7470451129384825]
We present UnLiMiTD (uncertainty-aware meta-learning for multimodal task distributions).
We take a probabilistic perspective and train a parametric, tuneable distribution over tasks on the meta-dataset.
We demonstrate that UnLiMiTD's predictions compare favorably to, and in most cases outperform, the standard baselines.
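One minimal way to realize "a parametric, tuneable distribution over tasks" is a learnable Gaussian over task-parameter vectors, sketched below; this parameterization is an assumption, and UnLiMiTD's actual construction may differ.

```python
# Assumption: model the task distribution as a learnable diagonal Gaussian over
# task-parameter vectors; UnLiMiTD's actual construction may differ.
import torch
from torch.distributions import Normal

dim = 16
mu = torch.zeros(dim, requires_grad=True)         # learnable mean of task params
log_sigma = torch.zeros(dim, requires_grad=True)  # learnable log std


def sample_task_params(n: int) -> torch.Tensor:
    """Draw n task-parameter vectors; the spread reflects task uncertainty and
    stays differentiable via the reparameterization trick."""
    return Normal(mu, log_sigma.exp()).rsample((n,))
```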
arXiv Detail & Related papers (2022-10-04T20:02:25Z)
- Learning Invariant Representation with Consistency and Diversity for Semi-supervised Source Hypothesis Transfer [46.68586555288172]
We propose a novel task named Semi-supervised Source Hypothesis Transfer (SSHT), which performs domain adaptation based on a source-trained model to generalize well in the target domain with only a few labels.
We propose Consistency and Diversity Learning (CDL), a simple but effective framework for SSHT that encourages prediction consistency between two randomly augmented views of unlabeled data.
Experimental results show that our method outperforms existing SSDA methods and unsupervised model adaptation methods on DomainNet, Office-Home and Office-31 datasets.
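A minimal sketch of the consistency part of such a framework, assuming a symmetric KL penalty between predictions on two random augmentations (`augment` and the divergence choice are assumptions, not CDL's stated design):

```python
# Illustrative consistency term between two random augmentations of the same
# unlabeled batch; the divergence and the `augment` function are assumptions.
import torch
import torch.nn.functional as F


def consistency_loss(model, x_unlabeled, augment) -> torch.Tensor:
    """Symmetric KL between predictions on two independently augmented views."""
    p1 = model(augment(x_unlabeled)).softmax(dim=-1)
    p2 = model(augment(x_unlabeled)).softmax(dim=-1)
    kl_12 = F.kl_div(p1.clamp_min(1e-8).log(), p2, reduction="batchmean")
    kl_21 = F.kl_div(p2.clamp_min(1e-8).log(), p1, reduction="batchmean")
    return 0.5 * (kl_12 + kl_21)
```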
arXiv Detail & Related papers (2021-07-07T04:14:24Z)
- Towards Accurate Knowledge Transfer via Target-awareness Representation Disentanglement [56.40587594647692]
We propose a novel transfer learning algorithm, introducing the idea of Target-awareness REpresentation Disentanglement (TRED).
TRED disentangles the knowledge relevant to the target task from the original source model and uses it as a regularizer while fine-tuning the target model.
Experiments on various real-world datasets show that our method stably improves standard fine-tuning by more than 2% on average.
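The summary does not specify the regularizer's form; one plausible reading, sketched below, keeps only target-relevant feature dimensions of the frozen source model and penalizes drift from them during fine-tuning (`relevance_mask` is a hypothetical stand-in for the disentanglement step):

```python
# Hypothetical TRED-style regularizer: penalize the fine-tuned features for
# drifting from the target-relevant part of the frozen source features.
# `relevance_mask` stands in for the paper's disentanglement output.
import torch
import torch.nn.functional as F


def tred_like_reg(feats_finetuned: torch.Tensor,
                  feats_source: torch.Tensor,
                  relevance_mask: torch.Tensor) -> torch.Tensor:
    """Match features only along dimensions flagged as relevant to the target."""
    return F.mse_loss(feats_finetuned * relevance_mask,
                      feats_source.detach() * relevance_mask)
```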
arXiv Detail & Related papers (2020-10-16T17:45:08Z)
- Learning Task-oriented Disentangled Representations for Unsupervised Domain Adaptation [165.61511788237485]
Unsupervised domain adaptation (UDA) aims to address the domain-shift problem between a labeled source domain and an unlabeled target domain.
We propose a dynamic task-oriented disentangling network (DTDN) to learn disentangled representations in an end-to-end fashion for UDA.
arXiv Detail & Related papers (2020-07-27T01:21:18Z)
- Task-Aware Variational Adversarial Active Learning [42.334671410592065]
We propose task-aware variational adversarial AL (TA-VAAL) that modifies task-agnostic VAAL.
Our proposed TA-VAAL outperforms state-of-the-art methods on various benchmark datasets for classification with balanced and imbalanced labels.
arXiv Detail & Related papers (2020-02-11T22:00:48Z)