Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
- URL: http://arxiv.org/abs/2206.03996v2
- Date: Fri, 10 Jun 2022 15:48:11 GMT
- Title: Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning
- Authors: Momin Abbas, Quan Xiao, Lisha Chen, Pin-Yu Chen, Tianyi Chen
- Abstract summary: We develop a sharpness-aware MAML approach that we term Sharp-MAML.
We empirically demonstrate that Sharp-MAML and its computation-efficient variant can outperform popular existing MAML baselines.
This is the first empirical and theoretical study on sharpness-aware minimization in the context of bilevel learning.
- Score: 71.26635165491105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Model-agnostic meta learning (MAML) is currently one of the dominating
approaches for few-shot meta-learning. Despite its effectiveness, the
optimization of MAML can be challenging due to the innate bilevel problem
structure. Specifically, the loss landscape of MAML is much more complex with
possibly more saddle points and local minimizers than its empirical risk
minimization counterpart. To address this challenge, we leverage the recently
invented sharpness-aware minimization and develop a sharpness-aware MAML
approach that we term Sharp-MAML. We empirically demonstrate that Sharp-MAML
and its computation-efficient variant can outperform popular existing MAML
baselines (e.g., $+12\%$ accuracy on Mini-Imagenet). We complement the
empirical study with the convergence rate analysis and the generalization bound
of Sharp-MAML. To the best of our knowledge, this is the first empirical and
theoretical study on sharpness-aware minimization in the context of bilevel
learning. The code is available at https://github.com/mominabbass/Sharp-MAML.
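As a concrete illustration of the idea in the abstract, the following is a minimal sketch, assuming PyTorch, of how a SAM-style perturbation can be wrapped around the MAML meta-update. The helper names (`inner_adapt`, `outer_loss`), the hyperparameters, and the choice to perturb only the meta (upper-level) parameters are illustrative assumptions for this sketch, not the authors' released implementation; the paper also considers where in the bilevel structure the perturbation is applied.

```python
import torch

# Hypothetical helpers (assumptions for this sketch, not the paper's code):
#   inner_adapt(params, support, lr) -> task-adapted parameters after a few inner SGD steps
#   outer_loss(adapted, query)       -> query-set loss evaluated at the adapted parameters

def sharp_maml_meta_step(params, tasks, meta_opt, rho=0.05, inner_lr=0.01):
    """One sharpness-aware meta-update: perturb the meta-parameters toward the
    direction that most increases the meta-loss (within an L2 ball of radius rho),
    then descend using the gradient evaluated at that perturbed point."""
    def meta_loss():
        return sum(outer_loss(inner_adapt(params, s, inner_lr), q)
                   for s, q in tasks) / len(tasks)

    # 1) Plain MAML meta-gradient at the current meta-parameters.
    meta_opt.zero_grad()
    loss = meta_loss()
    loss.backward()
    grads = [p.grad.detach().clone() for p in params]
    grad_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads)).item() + 1e-12

    # 2) SAM ascent step: epsilon = rho * g / ||g|| added to the meta-parameters.
    scale = rho / grad_norm
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.add_(g, alpha=scale)

    # 3) Meta-gradient at the perturbed ("sharp") point; this is what the optimizer uses.
    meta_opt.zero_grad()
    meta_loss().backward()

    # 4) Undo the perturbation and take the descent step from the original point.
    with torch.no_grad():
        for p, g in zip(params, grads):
            p.sub_(g, alpha=scale)
    meta_opt.step()
    return loss.item()
```

Note that the second forward/backward pass at the perturbed point roughly doubles the cost of each meta-step; reducing that overhead is the kind of saving a computation-efficient variant would target.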
Related papers
- Improved Meta Learning for Low Resource Speech Recognition [15.612232220719653]
We propose a new meta-learning-based framework for low-resource speech recognition that improves upon the previous model-agnostic meta-learning (MAML) approach.
Our proposed system outperforms the MAML-based low-resource ASR system across various languages in terms of character error rate and exhibits more stable training behavior.
arXiv Detail & Related papers (2022-05-11T15:50:47Z) - Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - Is Bayesian Model-Agnostic Meta Learning Better than Model-Agnostic Meta
Learning, Provably? [25.00480072097939]
We compare the meta-test risks of model-agnostic meta-learning (MAML) and Bayesian MAML.
Under both the distribution-agnostic and linear-centroid cases, we establish that Bayesian MAML indeed has a provably lower meta-test risk than MAML.
arXiv Detail & Related papers (2022-03-06T21:38:18Z) - Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD [16.417263188143313]
We propose a new computationally efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML).
We show that MAML, viewed through the lens of signSGD-oriented bilevel optimization (BLO), naturally yields an alternating optimization scheme that requires only first-order gradients of a learned meta-model.
In practice, we show that Sign-MAML outperforms FO-MAML in various few-shot image classification tasks (a minimal sketch of such a sign-based first-order update appears after this list).
arXiv Detail & Related papers (2021-09-15T18:01:55Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and show that MAML is analogous to a meta-learner using a supervised contrastive objective function, which introduces interference into training.
We propose a simple but effective technique, the zeroing trick, to alleviate this interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research topic.
Existing MAML algorithms rely on the 'episode' idea, sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - On Fast Adversarial Robustness Adaptation in Model-Agnostic
Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z) - How Does the Task Landscape Affect MAML Performance? [42.27488241647739]
We show that Model-Agnostic Meta-Learning (MAML) is more difficult to optimize than non-adaptive learning (NAL).
We analytically address this issue in a linear regression setting consisting of a mixture of easy and hard tasks.
We also give numerical and analytical results suggesting that these insights apply to two-layer neural networks.
arXiv Detail & Related papers (2020-10-27T23:54:44Z) - Weighted Meta-Learning [21.522768804834616]
Many popular meta-learning algorithms, such as model-agnostic meta-learning (MAML), only assume access to the target samples for fine-tuning.
In this work, we provide a general framework for meta-learning based on weighting the loss of different source tasks.
We develop a learning algorithm based on minimizing the error bound with respect to an empirical integral probability metric (IPM), including a weighted MAML algorithm.
arXiv Detail & Related papers (2020-03-20T19:00:42Z) - Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework that provides convergence guarantees for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
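For the Sign-MAML entry referenced above, here is a minimal sketch of what a sign-based, first-order meta-update could look like. The helper names (`inner_adapt`, `outer_loss`) and the specific update rule are illustrative assumptions for this sketch, not the authors' algorithm or released code.

```python
import torch

# Hypothetical helpers (assumptions for this sketch):
#   inner_adapt(params, support, lr) -> task-adapted parameters after a few inner SGD steps
#   outer_loss(adapted, query)       -> query-set loss evaluated at the adapted parameters

def sign_meta_step(params, tasks, meta_lr=1e-3, inner_lr=0.01):
    """Illustrative first-order meta-update: adapt per task, average the
    gradients taken at the adapted parameters (no second derivatives), and
    update the meta-parameters with the sign of that average (signSGD-style)."""
    meta_grads = [torch.zeros_like(p) for p in params]
    for support, query in tasks:
        adapted = inner_adapt(params, support, inner_lr)
        loss = outer_loss(adapted, query)
        # First-order approximation: the gradient w.r.t. the adapted parameters
        # is reused as the meta-gradient, avoiding backprop through the inner loop.
        grads = torch.autograd.grad(loss, adapted)
        for mg, g in zip(meta_grads, grads):
            mg.add_(g / len(tasks))
    with torch.no_grad():
        for p, mg in zip(params, meta_grads):
            p.sub_(meta_lr * mg.sign())   # sign-of-gradient descent on the meta-model
    return params
```

Using only the sign of the averaged meta-gradient keeps each meta-step first-order and inexpensive, which is the efficiency angle the Sign-MAML entry highlights.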