Improving Meta-Learning Generalization with Activation-Based
Early-Stopping
- URL: http://arxiv.org/abs/2208.02377v1
- Date: Wed, 3 Aug 2022 22:55:45 GMT
- Title: Improving Meta-Learning Generalization with Activation-Based
Early-Stopping
- Authors: Simon Guiroy, Christopher Pal, Gonçalo Mordido, Sarath Chandar
- Abstract summary: Meta-Learning algorithms for few-shot learning aim to train neural networks capable of generalizing to novel tasks using only a few examples.
Early-stopping is critical for performance, halting model training when it reaches optimal generalization to the new task distribution.
This is problematic in few-shot transfer learning settings, where the meta-test set comes from a different target dataset.
- Score: 12.299371455015239
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Meta-Learning algorithms for few-shot learning aim to train neural networks
capable of generalizing to novel tasks using only a few examples.
Early-stopping is critical for performance, halting model training when it
reaches optimal generalization to the new task distribution. Early-stopping
mechanisms in Meta-Learning typically rely on measuring the model performance
on labeled examples from a meta-validation set drawn from the training (source)
dataset. This is problematic in few-shot transfer learning settings, where the
meta-test set comes from a different target dataset (OOD) and can potentially
have a large distributional shift with the meta-validation set. In this work,
we propose Activation Based Early-stopping (ABE), an alternative to using
validation-based early-stopping for meta-learning. Specifically, we analyze the
evolution, during meta-training, of the neural activations at each hidden
layer, on a small set of unlabelled support examples from a single task of the
target task distribution, as this constitutes minimal and justifiably
accessible information from the target problem. Our experiments show that
simple, label-agnostic statistics on the activations offer an effective way to
estimate how the target generalization evolves over time. At each hidden layer,
we characterize the activation distributions by their first- and second-order
moments, further summarized along the feature dimensions, resulting in a
compact yet intuitive characterization in a four-dimensional space. Detecting
when, throughout training time, and at which layer, the target activation
trajectory diverges from the activation trajectory of the source data, allows
us to perform early-stopping and improve generalization in a large array of
few-shot transfer learning settings, across different algorithms, source and
target datasets.
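The abstract describes the activation statistics concretely enough for a rough illustration. The snippet below is a minimal sketch, assuming activations are available as NumPy arrays saved at each meta-training checkpoint; the helper names (`layer_summary`, `divergence_step`) and the simple distance-based divergence test are illustrative assumptions, not the authors' reference implementation.

```python
# Minimal sketch of the per-layer activation summary and a simple
# divergence-based stopping rule (illustrative assumptions, not the
# authors' reference implementation).
import numpy as np

def layer_summary(acts):
    """Summarize one layer's activations (n_examples, n_features) into 4 numbers."""
    mu = acts.mean(axis=0)        # per-feature first moment
    sigma = acts.std(axis=0)      # per-feature second-order statistic
    # Further summarize along the feature dimension -> a 4-dimensional point
    return np.array([mu.mean(), mu.std(), sigma.mean(), sigma.std()])

def divergence_step(source_traj, target_traj):
    """Pick the checkpoint where the target trajectory starts drifting from the source one.

    source_traj, target_traj: arrays of shape (n_checkpoints, 4), built by calling
    layer_summary on activations of source data and of a few unlabelled target
    support examples at each meta-training checkpoint, for one hidden layer.
    """
    gaps = np.linalg.norm(np.asarray(source_traj) - np.asarray(target_traj), axis=1)
    widening = np.diff(gaps) > 0  # where the source/target gap starts growing
    return int(np.argmax(widening)) if widening.any() else len(gaps) - 1
```

In this sketch the stopping point is computed per layer; how the per-layer signals are combined (e.g., stopping at the earliest detected divergence) is left open, since the abstract only states that both the time and the layer of divergence are detected.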
Related papers
- TIDo: Source-free Task Incremental Learning in Non-stationary
Environments [0.0]
Updating a model-based agent to learn new target tasks requires us to store past training data.
Few-shot task incremental learning methods overcome the limitation of labeled target datasets.
We propose a one-shot task incremental learning approach that can adapt to non-stationary source and target tasks.
arXiv Detail & Related papers (2023-01-28T02:19:45Z)
- Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z)
- Diverse Distributions of Self-Supervised Tasks for Meta-Learning in NLP [39.457091182683406]
We aim to provide task distributions for meta-learning by considering self-supervised tasks automatically proposed from unlabeled text.
Our analysis shows that all these factors meaningfully alter the task distribution, some inducing significant improvements in downstream few-shot accuracy of the meta-learned models.
arXiv Detail & Related papers (2021-11-02T01:50:09Z)
- Learning Prototype-oriented Set Representations for Meta-Learning [85.19407183975802]
Learning from set-structured data is a fundamental problem that has recently attracted increasing attention.
This paper provides a novel optimal transport based way to improve existing summary networks.
We further instantiate it to the cases of few-shot classification and implicit meta generative modeling.
arXiv Detail & Related papers (2021-10-18T09:49:05Z)
- Meta-Learning with Fewer Tasks through Task Interpolation [67.03769747726666]
Current meta-learning algorithms require a large number of meta-training tasks, which may not be accessible in real-world scenarios.
Using meta-learning with task interpolation (MLTI), our approach effectively generates additional tasks by randomly sampling a pair of tasks and interpolating the corresponding features and labels (a simplified sketch is given below).
Empirically, in our experiments on eight datasets from diverse domains, we find that the proposed general MLTI framework is compatible with representative meta-learning algorithms and consistently outperforms other state-of-the-art strategies.
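As a rough illustration of the interpolation step summarized above, here is a simplified, input-level sketch; MLTI itself interpolates hidden representations with a Beta-distributed mixing ratio, so mixing raw inputs and the helper name `interpolate_tasks` are assumptions made to keep the example self-contained.

```python
# Simplified sketch of mixing two few-shot tasks into a synthetic one
# (input-level stand-in for MLTI's feature/label interpolation).
import numpy as np

def interpolate_tasks(x1, y1, x2, y2, alpha=0.5):
    """x1, x2: support inputs of two sampled tasks; y1, y2: one-hot labels (same shapes)."""
    lam = np.random.beta(alpha, alpha)   # mixup-style mixing ratio
    x_new = lam * x1 + (1.0 - lam) * x2  # interpolate features
    y_new = lam * y1 + (1.0 - lam) * y2  # interpolate labels
    return x_new, y_new                  # treat as an additional meta-training task
```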
arXiv Detail & Related papers (2021-06-04T20:15:34Z)
- Meta-learning One-class Classifiers with Eigenvalue Solvers for Supervised Anomaly Detection [55.888835686183995]
We propose a neural network-based meta-learning method for supervised anomaly detection.
We experimentally demonstrate that the proposed method achieves better performance than existing anomaly detection and few-shot learning methods.
arXiv Detail & Related papers (2021-03-01T01:43:04Z)
- Meta-Regularization by Enforcing Mutual-Exclusiveness [0.8057006406834467]
We propose a regularization technique for meta-learning models that gives the model designer more control over the information flow during meta-training.
Our proposed regularization function shows an accuracy boost of approximately 36% on the Omniglot dataset.
arXiv Detail & Related papers (2021-01-24T22:57:19Z)
- Pre-training Text Representations as Meta Learning [113.3361289756749]
We introduce a learning algorithm that directly optimizes the model's ability to learn text representations for effective learning of downstream tasks.
We show that there is an intrinsic connection between multi-task pre-training and model-agnostic meta-learning with a sequence of meta-train steps.
arXiv Detail & Related papers (2020-04-12T09:05:47Z)
- Incremental Meta-Learning via Indirect Discriminant Alignment [118.61152684795178]
We develop a notion of incremental learning during the meta-training phase of meta-learning.
Our approach performs favorably at test time as compared to training a model with the full meta-training set.
arXiv Detail & Related papers (2020-02-11T01:39:12Z)