Convergence of First-Order Algorithms for Meta-Learning with Moreau
Envelopes
- URL: http://arxiv.org/abs/2301.06806v1
- Date: Tue, 17 Jan 2023 11:04:10 GMT
- Title: Convergence of First-Order Algorithms for Meta-Learning with Moreau
Envelopes
- Authors: Konstantin Mishchenko, Slavomír Hanzely, Peter Richtárik
- Abstract summary: We show the convergence of First-Order Model-Agnostic Meta-Learning (FO-MAML) to the vicinity of a solution of the Moreau-envelope objective.
Our main theoretical contribution is an improvement upon the inexact SGD framework.
- Score: 6.935471115003109
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this work, we consider the problem of minimizing the sum of Moreau
envelopes of given functions, which has previously appeared in the context of
meta-learning and personalized federated learning. In contrast to the existing
theory that requires running subsolvers until a certain precision is reached,
we only assume that a finite number of gradient steps is taken at each
iteration. As a special case, our theory allows us to show the convergence of
First-Order Model-Agnostic Meta-Learning (FO-MAML) to the vicinity of a
solution of the Moreau objective. We also study a more general family of
first-order algorithms that can be viewed as a generalization of FO-MAML. Our
main theoretical achievement is an improvement upon the inexact SGD
framework. In particular, our perturbed-iterate analysis allows for tighter
guarantees that improve the dependency on the problem's conditioning. In
contrast to the related work on meta-learning, ours does not require any
assumptions on the Hessian smoothness, and can leverage smoothness and
convexity of the reformulation based on Moreau envelopes. Furthermore, to fill
the gaps in the comparison of FO-MAML to the Implicit MAML (iMAML), we show
that the objective of iMAML is neither smooth nor convex, implying that it has
no convergence guarantees based on the existing theory.
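For reference, the Moreau-envelope reformulation described above is typically written as follows. This is the standard definition; the envelope parameter $\lambda$, task losses $f_i$, and task count $n$ are illustrative notation rather than quoted from the paper, and the paper's exact parametrization may differ (e.g. using $\tfrac{\lambda}{2}\|z - x\|^2$):

```latex
% Sum-of-Moreau-envelopes objective over n task losses f_1, ..., f_n;
% lambda > 0 is the envelope (proximal) parameter.
\[
  \min_{x \in \mathbb{R}^d} \; F(x) := \frac{1}{n} \sum_{i=1}^{n} M_{\lambda f_i}(x),
  \qquad
  M_{\lambda f_i}(x) := \min_{z \in \mathbb{R}^d}
    \Big\{ f_i(z) + \tfrac{1}{2\lambda} \|z - x\|^2 \Big\}.
\]
```

Below is a minimal sketch of a FO-MAML-style meta-update in the regime the abstract describes, where each task takes only a fixed, finite number of inner gradient steps rather than running a subsolver to a target precision. The function names, step sizes, and NumPy interface are illustrative assumptions, not the paper's notation or code:

```python
import numpy as np

def fo_maml_step(x, task_grads, inner_steps=5, alpha=0.01, beta=0.1):
    """One FO-MAML-style meta-update (illustrative sketch).

    x           : current meta-parameters (np.ndarray)
    task_grads  : list of callables; task_grads[i](z) returns a (stochastic)
                  gradient of task i's loss at the point z
    inner_steps : finite number of inner gradient steps per task
                  (no subsolver accuracy requirement)
    alpha, beta : inner and outer (meta) step sizes
    """
    meta_grad = np.zeros_like(x)
    for grad_i in task_grads:
        # Inner loop: a fixed number of gradient steps starting from the
        # current meta-parameters.
        z = x.copy()
        for _ in range(inner_steps):
            z = z - alpha * grad_i(z)
        # First-order approximation: drop the Jacobian of the inner loop and
        # use the task gradient at the adapted point directly.
        meta_grad += grad_i(z)
    meta_grad /= len(task_grads)
    # Outer step on the meta-parameters.
    return x - beta * meta_grad
```

With convex, smooth task losses (e.g. quadratics), the corresponding Moreau envelopes are smooth and convex, which is the structure the abstract says the analysis can exploit.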
Related papers
- Performative Reinforcement Learning with Linear Markov Decision Process [14.75815792682734]
We study the setting of performative reinforcement learning, where the deployed policy affects both the reward and the transition of the underlying Markov decision process.
We generalize the results to linear Markov decision processes, which are the primary theoretical model of large-scale MDPs.
arXiv Detail & Related papers (2024-11-07T23:04:48Z) - A New First-Order Meta-Learning Algorithm with Convergence Guarantees [37.85411810113886]
Gradient-based meta-learning, especially MAML, has emerged as a viable solution to accomplish this goal.
One problem MAML encounters is the computational and memory burden of computing the meta-gradients.
We propose a new first-order variant of MAML that we prove converges to a stationary point of the MAML objective, unlike other first-order variants.
arXiv Detail & Related papers (2024-09-05T16:37:26Z) - Moreau Envelope ADMM for Decentralized Weakly Convex Optimization [55.2289666758254]
This paper proposes a proximal variant of the alternating direction method of multipliers (ADMM) for distributed optimization.
The results of our numerical experiments indicate that our method is faster and more robust than widely-used approaches.
arXiv Detail & Related papers (2023-08-31T14:16:30Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate such interference.
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - On the Global Optimality of Model-Agnostic Meta-Learning [133.16370011229776]
Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior.
We characterize the optimality of the stationary points attained by MAML for both reinforcement learning and supervised learning, where the inner-level and outer-level problems are solved via first-order optimization methods.
arXiv Detail & Related papers (2020-06-23T17:33:14Z) - Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)