Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning
- URL: http://arxiv.org/abs/2002.07836v3
- Date: Mon, 13 Jul 2020 04:03:09 GMT
- Title: Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning
- Authors: Kaiyi Ji, Junjie Yang, Yingbin Liang
- Abstract summary: We develop a new theoretical framework to provide a convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage stepsize needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
- Score: 63.64636047748605
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: As a popular meta-learning approach, the model-agnostic meta-learning (MAML)
algorithm has been widely used due to its simplicity and effectiveness.
However, the convergence of the general multi-step MAML still remains
unexplored. In this paper, we develop a new theoretical framework to provide
such convergence guarantee for two types of objective functions that are of
interest in practice: (a) resampling case (e.g., reinforcement learning), where
loss functions take the form in expectation and new data are sampled as the
algorithm runs; and (b) finite-sum case (e.g., supervised learning), where loss
functions take the finite-sum form with given samples. For both cases, we
characterize the convergence rate and the computational complexity to attain an
$\epsilon$-accurate solution for multi-step MAML in the general nonconvex
setting. In particular, our results suggest that an inner-stage stepsize needs
to be chosen inversely proportional to the number $N$ of inner-stage steps in
order for $N$-step MAML to have guaranteed convergence. From the technical
perspective, we develop novel techniques to deal with the nested structure of
the meta gradient for multi-step MAML, which can be of independent interest.
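To make the stepsize guideline concrete, here is a minimal sketch of $N$-step MAML on a toy quadratic task distribution, with the inner-stage stepsize set proportional to $1/N$ as the analysis suggests. The toy loss, the constant c, the meta stepsize beta, and the task sampler are illustrative placeholders, not the paper's setup.

```python
# Hedged sketch of N-step MAML on a toy quadratic task distribution.
# The inner-loop stepsize alpha is set proportional to 1/N, following the
# guideline that alpha should shrink inversely with the number of inner-stage
# steps N. Losses, constants, and the task sampler are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

def task_loss_grad(w, task_center):
    """Gradient of a toy per-task quadratic loss 0.5 * ||w - task_center||^2."""
    return w - task_center

def n_step_maml(w_meta, num_meta_iters=200, N=5, c=0.5, beta=0.1, batch_tasks=8):
    alpha = c / N                              # inner-stage stepsize ~ 1/N
    dim = w_meta.size
    for _ in range(num_meta_iters):
        meta_grad = np.zeros_like(w_meta)
        for _ in range(batch_tasks):
            center = rng.normal(size=w_meta.shape)   # "resampling" a new task
            w, jac = w_meta.copy(), np.eye(dim)      # inner iterate and its Jacobian
            for _ in range(N):
                g = task_loss_grad(w, center)
                # For this quadratic toy loss the Hessian is the identity,
                # so the inner-loop Jacobian shrinks by (1 - alpha) each step.
                jac = jac @ (np.eye(dim) * (1.0 - alpha))
                w = w - alpha * g
            # Meta-gradient: Jacobian of adapted params times the outer gradient.
            meta_grad += jac.T @ task_loss_grad(w, center)
        w_meta = w_meta - beta * meta_grad / batch_tasks
    return w_meta

w0 = rng.normal(size=3)
print(n_step_maml(w0))
```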
Related papers
- A Stochastic Approach to Bi-Level Optimization for Hyperparameter Optimization and Meta Learning [74.80956524812714]
We tackle the general differentiable meta learning problem that is ubiquitous in modern deep learning.
These problems are often formalized as Bi-Level optimizations (BLO).
We introduce a novel perspective by turning a given BLO problem into a stochastic optimization, where the inner loss function becomes a smooth distribution, and the outer loss becomes an expected loss over the inner distribution.
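For orientation, a generic BLO template and the kind of smoothed reformulation described above can be written as follows; the notation is illustrative, not necessarily the paper's.

```latex
% Generic bi-level optimization (BLO) template; notation is illustrative.
\min_{\lambda} \; F(\lambda) := f\big(\lambda, \theta^{*}(\lambda)\big)
\quad \text{s.t.} \quad
\theta^{*}(\lambda) \in \arg\min_{\theta} g(\lambda, \theta).

% Smoothed reformulation sketched in the summary: the hard inner arg-min is
% replaced by a smooth distribution p_lambda induced by the inner loss, and
% the outer objective becomes an expectation over that distribution.
\min_{\lambda} \; \mathbb{E}_{\theta \sim p_{\lambda}}\big[ f(\lambda, \theta) \big].
```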
arXiv Detail & Related papers (2024-10-14T12:10:06Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate such interference.
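Assuming the zeroing trick amounts to resetting the linear classification head to zero before inner-loop adaptation (a reading of the summary, not a statement of the paper's exact procedure), a minimal sketch looks like this; the features, labels, and stepsize are placeholders.

```python
# Hedged sketch: the "zeroing trick" read as resetting the linear head to zero
# before inner-loop adaptation, so the adapted head is built purely from the
# support features rather than from a random or previously trained head.
# Features, labels, and stepsizes are illustrative placeholders.
import numpy as np

rng = np.random.default_rng(0)

def inner_adapt(head, features, labels_onehot, alpha=0.1, steps=1):
    """A few gradient steps of a softmax head on the support data."""
    for _ in range(steps):
        logits = features @ head
        probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
        grad = features.T @ (probs - labels_onehot) / len(features)
        head = head - alpha * grad
    return head

feat_dim, n_classes = 16, 5
support_feat = rng.normal(size=(25, feat_dim))
support_lab = np.eye(n_classes)[rng.integers(0, n_classes, 25)]

head = np.zeros((feat_dim, n_classes))   # zeroing trick: start the head at zero
adapted_head = inner_adapt(head, support_feat, support_lab)
```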
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
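As a rough illustration of point 2), solving the adaptation analytically can be sketched with ordinary kernel ridge regression, where a simple RBF kernel stands in for the meta-model's NTK; the kernel, regularizer, and data below are assumptions for illustration, not the paper's construction.

```python
# Hedged sketch: replacing the iterative inner loop by an analytic kernel-
# regression solution. A plain RBF kernel stands in for the NTK; lam is an
# illustrative regularization constant.
import numpy as np

rng = np.random.default_rng(0)

def kernel(X1, X2, bw=1.0):
    """RBF Gram matrix between two sets of inputs (NTK stand-in)."""
    d2 = ((X1[:, None, :] - X2[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * bw ** 2))

def analytic_adaptation(X_support, y_support, X_query, lam=1e-3):
    """Closed-form 'adaptation': kernel ridge regression on the support set."""
    K = kernel(X_support, X_support)
    coef = np.linalg.solve(K + lam * np.eye(len(K)), y_support)
    return kernel(X_query, X_support) @ coef   # predictions after adaptation

Xs, ys = rng.normal(size=(10, 2)), rng.normal(size=10)
Xq = rng.normal(size=(4, 2))
print(analytic_adaptation(Xs, ys, Xq))
```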
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL using classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z) - Meta Learning in the Continuous Time Limit [36.23467808322093]
We establish the ordinary differential equation (ODE) that underlies the training dynamics of Model-Agnostic Meta-Learning (MAML).
We propose a new BI-MAML training algorithm that significantly reduces the computational burden associated with existing MAML training methods.
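For intuition only, the continuous-time limit of a meta-gradient iteration is commonly written as a gradient flow on the meta-objective; the ODE below uses the standard one-step MAML objective and is not necessarily the exact equation derived in the paper.

```latex
% Gradient-flow intuition for the continuous-time limit (illustrative):
\dot{\theta}(t) = -\nabla F\big(\theta(t)\big),
\qquad
F(\theta) = \mathbb{E}_{\mathcal{T}}\Big[\ell_{\mathcal{T}}\big(\theta - \alpha \nabla \ell_{\mathcal{T}}(\theta)\big)\Big].
```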
arXiv Detail & Related papers (2020-06-19T01:47:31Z) - Convergence of Meta-Learning with Task-Specific Adaptation over Partial
Parameters [152.03852111442114]
Although model-agnostic meta-learning (MAML) is a very successful algorithm in meta-learning practice, it can have high computational complexity.
Our paper shows that such complexity can significantly affect the overall convergence performance of ANIL (Almost No Inner Loop), a simplified variant of MAML that adapts only a subset of parameters in the inner loop.
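A minimal sketch of ANIL-style partial-parameter adaptation, where only the linear head is updated in the inner loop while the shared feature extractor is trained only by the outer meta-update; the model, loss, and stepsizes are illustrative placeholders.

```python
# Hedged sketch of ANIL-style adaptation: only the linear head is updated in
# the inner loop; the feature extractor ("body") is frozen per task and would
# be trained only by the outer (meta) update. Placeholders throughout.
import numpy as np

rng = np.random.default_rng(0)

def features(body, x):
    return np.tanh(x @ body)                  # shared feature extractor

def mse_grad_head(head, feats, y):
    return feats.T @ (feats @ head - y) / len(y)

def anil_inner_loop(body, head, x_support, y_support, alpha=0.1, N=5):
    for _ in range(N):                        # adapt ONLY the head
        head = head - alpha * mse_grad_head(head, features(body, x_support), y_support)
    return head                               # body is left untouched

body = rng.normal(size=(8, 16))
head = np.zeros((16, 1))
xs, ys = rng.normal(size=(20, 8)), rng.normal(size=(20, 1))
adapted_head = anil_inner_loop(body, head, xs, ys)
```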
arXiv Detail & Related papers (2020-06-16T19:57:48Z) - On the Convergence Theory of Debiased Model-Agnostic Meta-Reinforcement
Learning [25.163423936635787]
We consider Model-Agnostic Meta-Learning (MAML) methods for Reinforcement Learning (RL) problems.
We propose a variant of the MAML method, named Stochastic Gradient Meta-Reinforcement Learning (SG-MRL).
We derive the iteration and sample complexity of SG-MRL to find an $\epsilon$-first-order stationary point, which, to the best of our knowledge, provides the first convergence guarantee for model-agnostic meta-reinforcement learning algorithms.
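For reference, an $\epsilon$-first-order stationary point is the standard notion used in nonconvex convergence analyses; with generic notation for a meta-objective $F$:

```latex
% epsilon-first-order stationary point (standard definition; generic notation)
\theta \ \text{is an } \epsilon\text{-FOSP of } F
\quad\Longleftrightarrow\quad
\big\| \nabla F(\theta) \big\| \le \epsilon .
```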
arXiv Detail & Related papers (2020-02-12T18:29:09Z)