Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD
- URL: http://arxiv.org/abs/2109.07497v1
- Date: Wed, 15 Sep 2021 18:01:55 GMT
- Title: Sign-MAML: Efficient Model-Agnostic Meta-Learning by SignSGD
- Authors: Chen Fan, Parikshit Ram, Sijia Liu
- Abstract summary: We propose a new computationally-efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML).
We show that MAML, through the lens of signSGD-oriented BLO, naturally yields an alternating optimization scheme that just requires first-order gradients of a learned meta-model.
In practice, we show that Sign-MAML outperforms FO-MAML in various few-shot image classification tasks.
- Score: 16.417263188143313
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We propose a new computationally-efficient first-order algorithm for Model-Agnostic Meta-Learning (MAML). The key enabling technique is to interpret MAML as a bilevel optimization (BLO) problem and to leverage sign-based SGD (signSGD) as the lower-level optimizer of the BLO. We show that MAML, viewed through the lens of signSGD-oriented BLO, naturally yields an alternating optimization scheme that requires only first-order gradients of a learned meta-model. We term the resulting MAML algorithm Sign-MAML. Compared to the conventional first-order MAML (FO-MAML) algorithm, Sign-MAML is theoretically grounded, as it does not impose any assumption on the absence of second-order derivatives during meta-training. In practice, we show that Sign-MAML outperforms FO-MAML on various few-shot image classification tasks and, compared to MAML, achieves a much more graceful tradeoff between classification accuracy and computational efficiency.
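To make the abstract's description more concrete, the sketch below illustrates one possible reading of the alternating scheme: signSGD updates at the lower level (task adaptation on a support set) and a first-order meta-update at the upper level (on the query set). This is a minimal illustrative sketch in PyTorch under those assumptions; the function name, toy hyperparameters, and the plain first-order meta-update are our own illustrative choices, not the authors' reference implementation.

```python
# Minimal sketch of a Sign-MAML-style alternating scheme (assumed reading of the
# abstract): signSGD as the lower-level optimizer, first-order meta-update at the
# upper level. Names and hyperparameters are illustrative, not from the paper.
import copy
import torch
import torch.nn as nn

def sign_maml_step(meta_model, tasks, inner_lr=0.01, meta_lr=0.001, inner_steps=5):
    """One meta-update. Each task is a ((x_support, y_support), (x_query, y_query)) pair."""
    loss_fn = nn.CrossEntropyLoss()
    meta_grads = [torch.zeros_like(p) for p in meta_model.parameters()]

    for (x_s, y_s), (x_q, y_q) in tasks:
        # Lower level: adapt a copy of the meta-model with signSGD, i.e. step in
        # the direction of the gradient's sign rather than the raw gradient.
        learner = copy.deepcopy(meta_model)
        params = list(learner.parameters())
        for _ in range(inner_steps):
            loss = loss_fn(learner(x_s), y_s)
            grads = torch.autograd.grad(loss, params)
            with torch.no_grad():
                for p, g in zip(params, grads):
                    p -= inner_lr * torch.sign(g)

        # Upper level: accumulate first-order gradients of the adapted model on
        # the query set (no second-order terms are computed here).
        q_loss = loss_fn(learner(x_q), y_q)
        q_grads = torch.autograd.grad(q_loss, params)
        for mg, g in zip(meta_grads, q_grads):
            mg += g / len(tasks)

    # Meta-update on the shared initialization.
    with torch.no_grad():
        for p, mg in zip(meta_model.parameters(), meta_grads):
            p -= meta_lr * mg
    return meta_model
```

In this reading, the only place the inner loop differs from an FO-MAML-style inner loop is the `torch.sign(g)` step, which replaces the raw gradient in the lower-level update.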
Related papers
- Sharp-MAML: Sharpness-Aware Model-Agnostic Meta Learning [71.26635165491105]
We develop a sharpness-aware MAML approach that we term Sharp-MAML.
We empirically demonstrate that Sharp-MAML and its computation-efficient variant can outperform popular existing MAML baselines.
This is the first empirical and theoretical study on sharpness-aware minimization in the context of bilevel learning.
arXiv Detail & Related papers (2022-06-08T16:20:11Z) - Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - Is Bayesian Model-Agnostic Meta Learning Better than Model-Agnostic Meta
Learning, Provably? [25.00480072097939]
We compare the meta test risks of model-agnostic meta-learning (MAML) and Bayesian MAML.
Under both the distribution agnostic and linear centroid cases, we have established that Bayesian MAML indeed has provably lower meta test risks than MAML.
arXiv Detail & Related papers (2022-03-06T21:38:18Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate such interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - On Fast Adversarial Robustness Adaptation in Model-Agnostic
Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z) - B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL using classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z) - Meta Learning in the Continuous Time Limit [36.23467808322093]
We establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML).
We propose a new BI-MAML training algorithm that significantly reduces the computational burden associated with existing MAML training methods.
arXiv Detail & Related papers (2020-06-19T01:47:31Z) - Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)