Understanding Benign Overfitting in Nested Meta Learning
- URL: http://arxiv.org/abs/2206.13482v1
- Date: Mon, 27 Jun 2022 17:46:57 GMT
- Title: Understanding Benign Overfitting in Nested Meta Learning
- Authors: Lisha Chen, Songtao Lu, Tianyi Chen
- Abstract summary: We focus on meta learning settings with a challenging nested structure, which we term nested meta learning.
Our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation and benign overfitting in nested meta learning tasks.
- Score: 44.08080315581763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta learning has demonstrated tremendous success in few-shot learning with limited supervised data. In those settings, the meta model is usually overparameterized. While the conventional statistical learning theory suggests that overparameterized models tend to overfit, empirical evidence reveals that overparameterized meta learning methods still work well -- a phenomenon often called "benign overfitting." To understand this phenomenon, we focus on the meta learning settings with a challenging nested structure that we term the nested meta learning, and analyze its generalization performance under an overparameterized meta learning model. While our analysis uses the relatively tractable linear models, our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation and benign overfitting in nested meta learning tasks. We corroborate our theoretical claims through numerical simulations.
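To make the nested structure concrete, here is a minimal sketch of a bilevel meta learning objective with linear models and a single inner gradient step. It is an illustrative formulation under assumed notation (task data X_m, y_m split into training and validation parts, inner step size alpha), not the paper's exact setup:

    \min_{w_0 \in \mathbb{R}^d} \; \frac{1}{M} \sum_{m=1}^{M}
        \frac{1}{2 n_{\mathrm{val}}} \big\| X_m^{\mathrm{val}} \, \mathcal{A}_m(w_0) - y_m^{\mathrm{val}} \big\|^2,
    \qquad
    \mathcal{A}_m(w_0) \;=\; w_0 - \alpha \, \nabla_w \, \frac{1}{2 n_{\mathrm{tr}}} \big\| X_m^{\mathrm{tr}} w - y_m^{\mathrm{tr}} \big\|^2 \Big|_{w = w_0}.

The inner (nested) level adapts the shared parameter w_0 to each task m on its training split, and the outer level evaluates the adapted parameter on the validation split. Benign overfitting concerns the overparameterized regime where the dimension d exceeds the number of per-task samples, yet the learned meta model still generalizes.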
Related papers
- Meta-Learning Neural Mechanisms rather than Bayesian Priors [4.451173777061901]
We investigate the meta-learning of formal languages and find that, contrary to previous claims, meta-trained models are not learning simplicity-based priors.
We find evidence that meta-training imprints neural mechanisms into the model, which function like cognitive primitives for the network on downstream tasks.
arXiv Detail & Related papers (2025-03-20T11:33:59Z) - Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z) - Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z) - Provable Generalization of Overparameterized Meta-learning Trained with SGD [62.892930625034374]
We study the generalization of a widely used meta-learning approach, Model-Agnostic Meta-Learning (MAML).
We provide both upper and lower bounds on the excess risk of MAML, which capture how the SGD dynamics affect these generalization guarantees.
Our theoretical findings are further validated by experiments.
arXiv Detail & Related papers (2022-06-18T07:22:57Z) - Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis [8.028776552383365]
We provide a generic understanding of both the conventional learning-to-learn framework and modern model-agnostic meta-learning algorithms.
We provide a data-dependent generalization bound for a variant of MAML, which is non-vacuous for deep few-shot learning.
arXiv Detail & Related papers (2021-09-29T17:45:54Z) - On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z) - Meta-Learning Requires Meta-Augmentation [13.16019567695033]
We describe two forms of meta-learning overfitting and show that they appear experimentally in common benchmarks.
We then use an information-theoretic framework to discuss meta-augmentation, a way to add randomness that discourages the base learner and model from learning trivial solutions.
We demonstrate that meta-augmentation produces large complementary benefits to recently proposed meta-regularization techniques.
arXiv Detail & Related papers (2020-07-10T18:04:04Z) - Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
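To illustrate the reduction described in the last entry above, the following sketch treats each task's small dataset as a "feature" and a task-specific target model as a "label", then fits an ordinary supervised regressor from dataset summaries to model parameters. The dataset featurization (the statistic X^T y / n) and the ridge regressor are assumptions made here for illustration, not the construction used in the cited paper.

    # Illustrative sketch: reduce meta learning to supervised learning by treating
    # (task dataset, target model) pairs as (feature, label) samples.
    # The featurization and the regressor are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_support, n_tasks = 5, 10, 200

    def sample_task():
        """Each task is a noisy linear regression problem with its own parameter w."""
        w = rng.normal(size=d)                        # task-specific ground truth ("target model")
        X = rng.normal(size=(n_support, d))
        y = X @ w + 0.1 * rng.normal(size=n_support)
        return X, y, w

    def featurize(X, y):
        """Summarize a task's dataset with a simple d-dimensional statistic."""
        return X.T @ y / len(y)

    # Build the (feature, label) training set over tasks.
    tasks = [sample_task() for _ in range(n_tasks)]
    F = np.stack([featurize(X, y) for X, y, _ in tasks])    # features: dataset summaries
    W = np.stack([w for _, _, w in tasks])                  # labels: target model parameters

    # Ordinary ridge regression from dataset summaries to model parameters.
    lam = 1e-3
    B = np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ W)

    # Meta-test: predict a model for a new task directly from its support set.
    X_new, y_new, w_new = sample_task()
    w_pred = featurize(X_new, y_new) @ B
    print("parameter estimation error:", np.linalg.norm(w_pred - w_new))

Under these assumptions, the regressor learns across tasks how dataset summaries map to good task models, which is the sense in which meta-learning becomes an instance of supervised learning.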
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.