Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning
- URL: http://arxiv.org/abs/2303.11183v2
- Date: Mon, 19 Jun 2023 14:42:29 GMT
- Title: Architecture, Dataset and Model-Scale Agnostic Data-free Meta-Learning
- Authors: Zixuan Hu, Li Shen, Zhenyi Wang, Tongliang Liu, Chun Yuan, Dacheng Tao
- Abstract summary: We propose ePisode cUrriculum inveRsion (ECI) during data-free meta training and invErsion calibRation following inner loop (ICFIL) during meta testing.
ECI adaptively increases the difficulty level of pseudo episodes according to the real-time feedback of the meta model.
We formulate the optimization process of meta training with ECI as an adversarial form in an end-to-end manner.
- Score: 119.70303730341938
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The goal of data-free meta-learning is to learn useful prior knowledge from a
collection of pre-trained models without accessing their training data.
However, existing works only solve the problem in parameter space, which (i)
ignore the fruitful data knowledge contained in the pre-trained models; (ii)
cannot scale to large-scale pre-trained models; and (iii) can only meta-learn
pre-trained models with the same network architecture. To address these issues,
we propose a unified framework, dubbed PURER, which contains: (1) ePisode
cUrriculum inveRsion (ECI) during data-free meta training; and (2) invErsion
calibRation following inner loop (ICFIL) during meta testing. During meta
training, we propose ECI to perform pseudo episode training for learning to
adapt fast to new unseen tasks. Specifically, we progressively synthesize a
sequence of pseudo episodes by distilling the training data from each
pre-trained model. The ECI adaptively increases the difficulty level of pseudo
episodes according to the real-time feedback of the meta model. We formulate
the optimization process of meta training with ECI as an adversarial form in an
end-to-end manner. During meta testing, we further propose a simple
plug-and-play supplement, ICFIL, used only during meta testing to narrow the gap
between the meta training and meta testing task distributions. Extensive experiments
in various real-world scenarios show the superior performance of our method.
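The abstract describes ECI only at a high level. As a rough illustration (not the authors' implementation), the sketch below inverts a pseudo episode from a frozen pre-trained classifier with a plain cross-entropy inversion objective and raises the episode difficulty when the meta-model's query accuracy crosses a threshold; the function names, the inversion loss, and the "more ways = harder" schedule are all assumptions, and the adversarial end-to-end formulation is omitted.

```python
# Hypothetical sketch of data-free pseudo-episode inversion with a naive
# accuracy-driven curriculum; this is NOT the PURER/ECI algorithm itself.
import torch
import torch.nn.functional as F

def invert_pseudo_episode(pretrained, num_classes, n_way=5, k_shot=1, q_query=15,
                          image_shape=(3, 32, 32), steps=200, lr=0.1):
    """Optimize random noise so the frozen pre-trained classifier assigns it the
    sampled class labels, yielding a synthetic (support, query) episode."""
    pretrained.eval()
    chosen = torch.randperm(num_classes)[:n_way]          # classes for this episode
    per_class = k_shot + q_query
    targets = chosen.repeat_interleave(per_class)          # labels in the original space
    x = torch.randn(n_way * per_class, *image_shape, requires_grad=True)
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        F.cross_entropy(pretrained(x), targets).backward()  # simple inversion loss
        opt.step()
    x = x.detach().view(n_way, per_class, *image_shape)
    y = torch.arange(n_way).repeat_interleave(per_class).view(n_way, per_class)
    support = (x[:, :k_shot].reshape(-1, *image_shape), y[:, :k_shot].reshape(-1))
    query = (x[:, k_shot:].reshape(-1, *image_shape), y[:, k_shot:].reshape(-1))
    return support, query

def update_difficulty(n_way, query_acc, max_way, threshold=0.9):
    """Naive curriculum: widen the episode once the meta-model handles it well."""
    return min(n_way + 1, max_way) if query_acc >= threshold else n_way
```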
Related papers
- What Do Learning Dynamics Reveal About Generalization in LLM Reasoning? [83.83230167222852]
We find that a model's generalization behavior can be effectively characterized by a training metric we call pre-memorization train accuracy.
By connecting a model's learning behavior to its generalization, pre-memorization train accuracy can guide targeted improvements to training strategies.
arXiv Detail & Related papers (2024-11-12T09:52:40Z) - FREE: Faster and Better Data-Free Meta-Learning [77.90126669914324]
Data-Free Meta-Learning (DFML) aims to extract knowledge from a collection of pre-trained models without requiring the original data.
We introduce the Faster and Better Data-Free Meta-Learning framework, which contains: (i) a meta-generator for rapidly recovering training tasks from pre-trained models; and (ii) a meta-learner for generalizing to new unseen tasks.
arXiv Detail & Related papers (2024-05-02T03:43:19Z) - Boosting Meta-Training with Base Class Information for Few-Shot Learning [35.144099160883606]
We propose an end-to-end training paradigm consisting of two alternating loops.
In the outer loop, we calculate cross entropy loss on the entire training set while updating only the final linear layer.
This training paradigm not only converges quickly but also outperforms existing baselines, indicating that information from the overall training set and the meta-learning training paradigm could mutually reinforce one another.
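A minimal sketch of what such an outer-loop step could look like, assuming a model whose final linear layer is exposed as `model.classifier`; the attribute name, optimizer, and learning rate are assumptions rather than details from the paper.

```python
# Hedged sketch: one pass over the whole training set with cross-entropy,
# updating only the final linear layer (all other parameters frozen).
import torch
import torch.nn.functional as F

def outer_loop_epoch(model, train_loader, lr=0.01, device="cpu"):
    for p in model.parameters():
        p.requires_grad_(False)
    for p in model.classifier.parameters():      # assumed name of the final layer
        p.requires_grad_(True)
    opt = torch.optim.SGD(model.classifier.parameters(), lr=lr)
    model.train()
    for images, labels in train_loader:
        opt.zero_grad()
        loss = F.cross_entropy(model(images.to(device)), labels.to(device))
        loss.backward()
        opt.step()
```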
arXiv Detail & Related papers (2024-03-06T05:13:23Z) - Unsupervised Representation Learning to Aid Semi-Supervised Meta
Learning [16.534014215010757]
We propose a one-shot unsupervised meta-learning to learn latent representation of training samples.
A temperature-scaled cross-entropy loss is used in the inner loop of meta-learning to prevent overfitting.
The proposed method is model agnostic and can aid any meta-learning model to improve accuracy.
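For reference, a temperature-scaled cross-entropy can be written as below; the temperature value and its exact placement in the inner loop are assumptions, not details taken from the paper.

```python
# Minimal sketch of a temperature-scaled cross-entropy loss.
import torch
import torch.nn.functional as F

def temperature_scaled_ce(logits: torch.Tensor, targets: torch.Tensor,
                          temperature: float = 4.0) -> torch.Tensor:
    """Divide the logits by a temperature > 1 before the cross-entropy, which
    softens the predicted distribution and can damp inner-loop overfitting."""
    return F.cross_entropy(logits / temperature, targets)
```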
arXiv Detail & Related papers (2023-10-19T18:25:22Z) - Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
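One common way to maintain a temporal ensemble of a model is an exponential moving average (EMA) of its parameters; the sketch below shows that generic mechanism only, not the SiMT algorithm itself, and the decay value and update schedule are assumptions.

```python
# Generic momentum (EMA) target over a meta-learner's parameters.
import copy
import torch

def make_momentum_target(meta_model):
    """Initialize the target as a frozen copy of the current meta-learner."""
    target = copy.deepcopy(meta_model)
    for p in target.parameters():
        p.requires_grad_(False)
    return target

@torch.no_grad()
def update_momentum_target(meta_model, target_model, decay=0.999):
    """Parameter-wise EMA: target <- decay * target + (1 - decay) * meta."""
    for p_t, p_m in zip(target_model.parameters(), meta_model.parameters()):
        p_t.mul_(decay).add_(p_m, alpha=1.0 - decay)
```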
arXiv Detail & Related papers (2022-10-11T06:45:15Z) - MetaICL: Learning to Learn In Context [87.23056864536613]
We introduce MetaICL, a new meta-training framework for few-shot learning where a pretrained language model is tuned to do in-context learning on a large set of training tasks.
We show that MetaICL approaches (and sometimes beats) the performance of models fully finetuned on the target task training data, and outperforms models with nearly 8x more parameters.
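At a high level, meta-training for in-context learning turns each training step into a small prompt-completion problem. A rough sketch of forming one such instance is below; the field layout, separators, and choice of k are assumptions, not MetaICL's exact format.

```python
# Illustrative construction of an in-context meta-training instance from one task.
import random

def build_incontext_instance(task_examples, k=4, seed=None):
    """Sample k (input, output) demonstrations plus one target example from a
    single training task and concatenate them into one sequence; the language
    model is then tuned to generate the final output."""
    rng = random.Random(seed)
    sampled = rng.sample(task_examples, k + 1)
    demos, (target_in, target_out) = sampled[:k], sampled[k]
    prompt = "".join(f"{x}\n{y}\n\n" for x, y in demos) + f"{target_in}\n"
    return prompt, target_out
```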
arXiv Detail & Related papers (2021-10-29T17:42:08Z) - Meta-Regularization by Enforcing Mutual-Exclusiveness [0.8057006406834467]
We propose a regularization technique for meta-learning models that gives the model designer more control over the information flow during meta-training.
Our proposed regularization function shows an accuracy boost of approximately 36% on the Omniglot dataset.
arXiv Detail & Related papers (2021-01-24T22:57:19Z) - Incremental Meta-Learning via Indirect Discriminant Alignment [118.61152684795178]
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
arXiv Detail & Related papers (2020-02-11T01:39:12Z)