Understanding Benign Overfitting in Nested Meta Learning
- URL: http://arxiv.org/abs/2206.13482v1
- Date: Mon, 27 Jun 2022 17:46:57 GMT
- Title: Understanding Benign Overfitting in Nested Meta Learning
- Authors: Lisha Chen, Songtao Lu, Tianyi Chen
- Abstract summary: We focus on meta learning settings with a challenging nested structure, which we term nested meta learning.
Our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation and benign overfitting in nested meta learning tasks.
- Score: 44.08080315581763
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta learning has demonstrated tremendous success in few-shot learning with limited supervised data. In those settings, the meta model is usually overparameterized. While the conventional statistical learning theory suggests that overparameterized models tend to overfit, empirical evidence reveals that overparameterized meta learning methods still work well -- a phenomenon often called "benign overfitting." To understand this phenomenon, we focus on the meta learning settings with a challenging nested structure that we term the nested meta learning, and analyze its generalization performance under an overparameterized meta learning model. While our analysis uses the relatively tractable linear models, our theory contributes to understanding the delicate interplay among data heterogeneity, model adaptation and benign overfitting in nested meta learning tasks. We corroborate our theoretical claims through numerical simulations.
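To make the nested structure concrete, here is a minimal sketch of a bilevel meta learning objective with linear models and a single inner gradient step. It is an illustrative formulation under assumed notation (task data X_m, y_m split into training and validation parts, inner step size alpha), not the paper's exact setup:

    \min_{w_0 \in \mathbb{R}^d} \; \frac{1}{M} \sum_{m=1}^{M}
        \frac{1}{2 n_{\mathrm{val}}} \big\| X_m^{\mathrm{val}} \, \mathcal{A}_m(w_0) - y_m^{\mathrm{val}} \big\|^2,
    \qquad
    \mathcal{A}_m(w_0) \;=\; w_0 - \alpha \, \nabla_w \, \frac{1}{2 n_{\mathrm{tr}}} \big\| X_m^{\mathrm{tr}} w - y_m^{\mathrm{tr}} \big\|^2 \Big|_{w = w_0}.

The inner (nested) level adapts the shared parameter w_0 to each task m on its training split, and the outer level evaluates the adapted parameter on the validation split. Benign overfitting concerns the overparameterized regime where the dimension d exceeds the number of per-task samples, yet the learned meta model still generalizes.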
Related papers
- Meta-Learning Neural Mechanisms rather than Bayesian Priors [4.451173777061901]
We investigate the meta-learning of formal languages and find that, contrary to previous claims, meta-trained models are not learning simplicity-based priors.
We find evidence that meta-training imprints neural mechanisms into the model, which function like cognitive primitives for the network on downstream tasks.
arXiv Detail & Related papers (2025-03-20T11:33:59Z) - Theoretical Characterization of the Generalization Performance of Overfitted Meta-Learning [70.52689048213398]
This paper studies the performance of overfitted meta-learning under a linear regression model with Gaussian features.
We find new and interesting properties that do not exist in single-task linear regression.
Our analysis suggests that benign overfitting is more significant and easier to observe when the noise and the diversity/fluctuation of the ground truth of each training task are large.
arXiv Detail & Related papers (2023-04-09T20:36:13Z) - Meta-Learning with Self-Improving Momentum Target [72.98879709228981]
We propose Self-improving Momentum Target (SiMT) to improve the performance of a meta-learner.
SiMT generates the target model by adapting from the temporal ensemble of the meta-learner.
We show that SiMT brings a significant performance gain when combined with a wide range of meta-learning methods.
arXiv Detail & Related papers (2022-10-11T06:45:15Z) - Provable Generalization of Overparameterized Meta-learning Trained with SGD [62.892930625034374]
We study the generalization of a widely used meta-learning approach, Model-Agnostic Meta-Learning (MAML).
We provide both upper and lower bounds on the excess risk of MAML, which capture how the SGD dynamics affect these generalization guarantees.
Our theoretical findings are further validated by experiments.
arXiv Detail & Related papers (2022-06-18T07:22:57Z) - Generalization Bounds For Meta-Learning: An Information-Theoretic Analysis [8.028776552383365]
We provide a generic understanding of both the conventional learning-to-learn framework and modern model-agnostic meta-learning algorithms.
We provide a data-dependent generalization bound for a variant of MAML, which is non-vacuous for deep few-shot learning.
arXiv Detail & Related papers (2021-09-29T17:45:54Z) - On Fast Adversarial Robustness Adaptation in Model-Agnostic Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z) - Meta-Learning Requires Meta-Augmentation [13.16019567695033]
We describe two forms of meta-learning overfitting and show that they appear experimentally in common benchmarks.
We then use an information-theoretic framework to discuss meta-augmentation, a way to add randomness that discourages the base learner and model from learning trivial solutions.
We demonstrate that meta-augmentation produces large complementary benefits to recently proposed meta-regularization techniques.
arXiv Detail & Related papers (2020-07-10T18:04:04Z) - Revisiting Meta-Learning as Supervised Learning [69.2067288158133]
We aim to provide a principled, unifying framework by revisiting and strengthening the connection between meta-learning and traditional supervised learning.
By treating pairs of task-specific data sets and target models as (feature, label) samples, we can reduce many meta-learning algorithms to instances of supervised learning.
This view not only unifies meta-learning into an intuitive and practical framework but also allows us to transfer insights from supervised learning directly to improve meta-learning.
arXiv Detail & Related papers (2020-02-03T06:13:01Z)
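To illustrate the reduction described in the last entry above, the following sketch treats each task's small dataset as a "feature" and a task-specific target model as a "label", then fits an ordinary supervised regressor from dataset summaries to model parameters. The dataset featurization (the statistic X^T y / n) and the ridge regressor are assumptions made here for illustration, not the construction used in the cited paper.

    # Illustrative sketch: reduce meta learning to supervised learning by treating
    # (task dataset, target model) pairs as (feature, label) samples.
    # The featurization and the regressor are illustrative assumptions.
    import numpy as np

    rng = np.random.default_rng(0)
    d, n_support, n_tasks = 5, 10, 200

    def sample_task():
        """Each task is a noisy linear regression problem with its own parameter w."""
        w = rng.normal(size=d)                        # task-specific ground truth ("target model")
        X = rng.normal(size=(n_support, d))
        y = X @ w + 0.1 * rng.normal(size=n_support)
        return X, y, w

    def featurize(X, y):
        """Summarize a task's dataset with a simple d-dimensional statistic."""
        return X.T @ y / len(y)

    # Build the (feature, label) training set over tasks.
    tasks = [sample_task() for _ in range(n_tasks)]
    F = np.stack([featurize(X, y) for X, y, _ in tasks])    # features: dataset summaries
    W = np.stack([w for _, _, w in tasks])                  # labels: target model parameters

    # Ordinary ridge regression from dataset summaries to model parameters.
    lam = 1e-3
    B = np.linalg.solve(F.T @ F + lam * np.eye(d), F.T @ W)

    # Meta-test: predict a model for a new task directly from its support set.
    X_new, y_new, w_new = sample_task()
    w_pred = featurize(X_new, y_new) @ B
    print("parameter estimation error:", np.linalg.norm(w_pred - w_new))

Under these assumptions, the regressor learns across tasks how dataset summaries map to good task models, which is the sense in which meta-learning becomes an instance of supervised learning.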
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.