Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning
- URL: http://arxiv.org/abs/2106.04911v4
- Date: Mon, 24 Apr 2023 20:21:31 GMT
- Title: Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning
- Authors: Bokun Wang, Zhuoning Yuan, Yiming Ying, Tianbao Yang
- Abstract summary: Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the ``episode'' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
- Score: 56.17603785248675
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: In recent years, model-agnostic meta-learning (MAML) has become a popular
research area. However, the stochastic optimization of MAML is still
underdeveloped. Existing MAML algorithms rely on the ``episode'' idea by
sampling a few tasks and data points to update the meta-model at each
iteration. Nonetheless, these algorithms either fail to guarantee convergence
with a constant mini-batch size or require processing a large number of tasks
at every iteration, which is unsuitable for continual learning or cross-device
federated learning where only a small number of tasks are available per
iteration or per round. To address these issues, this paper proposes
memory-based stochastic algorithms for MAML that converge with vanishing error.
The proposed algorithms require sampling a constant number of tasks and data
samples per iteration, making them suitable for the continual learning
scenario. Moreover, we introduce a communication-efficient memory-based MAML
algorithm for personalized federated learning in cross-device (with client
sampling) and cross-silo (without client sampling) settings. Our theoretical
analysis improves the optimization theory for MAML, and our empirical results
corroborate our theoretical findings. Interested readers can access our code at
https://github.com/bokun-wang/moml.
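To make the memory-based idea concrete, below is a minimal sketch, assuming one-step inner adaptation, a per-task moving-average memory of inner iterates, and a hypothetical task interface exposing gradients (`grad`) and Hessian-vector products (`hvp`). It is meant to convey the flavor of sampling only a constant number of tasks per iteration while reusing per-task memory; it is not the authors' exact algorithm (see the repository above for that).

```python
# Hedged sketch of a memory-based MAML update (not the authors' exact algorithm).
import numpy as np

def memory_maml(tasks, w, n_iters=1000, tasks_per_iter=2,
                inner_lr=0.01, outer_lr=0.001, beta=0.1, rng=None):
    """tasks[i] is assumed to expose .grad(w) and .hvp(w, v) (Hessian-vector product)."""
    rng = rng or np.random.default_rng(0)
    m = len(tasks)
    u = [w.copy() for _ in range(m)]          # memory: moving average of inner iterates
    for _ in range(n_iters):
        sampled = rng.choice(m, size=tasks_per_iter, replace=False)
        meta_grad = np.zeros_like(w)
        for i in sampled:
            inner = w - inner_lr * tasks[i].grad(w)      # one-step inner adaptation
            u[i] = (1 - beta) * u[i] + beta * inner      # refresh this task's memory
            g_adapted = tasks[i].grad(u[i])              # outer gradient at the remembered iterate
            # chain rule for L_i(w - a * grad L_i(w)): (I - a * Hessian(w)) @ g_adapted
            meta_grad += g_adapted - inner_lr * tasks[i].hvp(w, g_adapted)
        w = w - outer_lr * (meta_grad / tasks_per_iter)
    return w
```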
Related papers
- Fast Adaptation with Kernel and Gradient based Meta Learning [4.763682200721131]
We propose two algorithms to improve both the inner and outer loops of Model-Agnostic Meta-Learning (MAML).
Our first algorithm redefines the optimization problem in the function space to update the model using closed-form solutions.
In the outer loop, the second algorithm adjusts the learning of the meta-learner by assigning weights to the losses from each task of the inner loop.
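As a rough illustration of what a closed-form, function-space inner loop can look like, the snippet below solves the inner fit with kernel ridge regression; the RBF kernel, ridge parameter, and interface are assumptions for illustration, not details taken from the cited paper.

```python
# Illustrative closed-form inner loop via kernel ridge regression (assumed setup).
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * sq)

def closed_form_adapt(X_support, y_support, X_query, ridge=1e-3, gamma=1.0):
    """Fit the task in closed form on the support set and predict on the query set."""
    K = rbf_kernel(X_support, X_support, gamma)
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_support)), y_support)
    return rbf_kernel(X_query, X_support, gamma) @ alpha
```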
arXiv Detail & Related papers (2024-11-01T07:05:03Z)
- MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function.
We propose a simple but effective technique, the zeroing trick, to alleviate this interference.
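As described, the zeroing trick amounts to resetting the final classification layer before inner-loop adaptation. A minimal sketch follows; the `classifier` attribute name is an assumption about how the model is defined.

```python
# Minimal sketch of the zeroing trick: zero the classifier head before each episode.
import torch
import torch.nn as nn

def zero_last_layer(model: nn.Module):
    """Reset the final linear layer to zero at the start of every episode."""
    head = model.classifier            # assumed attribute name for the final linear layer
    with torch.no_grad():
        head.weight.zero_()
        if head.bias is not None:
            head.bias.zero_()
```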
arXiv Detail & Related papers (2021-06-29T12:52:26Z)
- Energy-Efficient and Federated Meta-Learning via Projected Stochastic Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z)
- Sample Efficient Linear Meta-Learning by Alternating Minimization [74.40553081646995]
We study a simple alternating minimization method (MLLAM) which alternately learns the low-dimensional subspace and the regressors.
We show that for a constant subspace dimension MLLAM obtains nearly-optimal estimation error, despite requiring only $\Omega(\log d)$ samples per task.
We propose a novel task subset selection scheme that ensures the same strong statistical guarantee as MLLAM.
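A hedged sketch of the alternating-minimization recipe summarized above (not the exact MLLAM updates): alternate between solving each task's low-dimensional regressor in closed form and improving the shared subspace.

```python
# Illustrative alternating minimization for linear meta-learning (assumed setup).
import numpy as np

def alt_min_linear_meta(X, y, k, n_rounds=50, ridge=1e-6, step=0.01, seed=0):
    """X: list of (n_t, d) design matrices, y: list of (n_t,) targets, k: subspace dim."""
    d = X[0].shape[1]
    U = np.linalg.qr(np.random.default_rng(seed).standard_normal((d, k)))[0]
    for _ in range(n_rounds):
        # (a) fix U, solve each task's k-dimensional regressor by ridge least squares
        V = []
        for Xt, yt in zip(X, y):
            Zt = Xt @ U
            V.append(np.linalg.solve(Zt.T @ Zt + ridge * np.eye(k), Zt.T @ yt))
        # (b) fix the regressors, take a gradient step on U and re-orthonormalize
        G = np.zeros_like(U)
        for Xt, yt, vt in zip(X, y, V):
            residual = Xt @ U @ vt - yt
            G += Xt.T @ np.outer(residual, vt)     # d/dU of 0.5 * ||Xt U vt - yt||^2
        U = np.linalg.qr(U - step * G)[0]
    return U, V
```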
arXiv Detail & Related papers (2021-05-18T06:46:48Z)
- B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL using classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z)
- A Nested Bi-level Optimization Framework for Robust Few Shot Learning [10.147225934340877]
NestedMAML learns to assign weights to training tasks or instances.
Experiments on synthetic and real-world datasets demonstrate that NestedMAML efficiently mitigates the effects of "unwanted" tasks or instances.
arXiv Detail & Related papers (2020-11-13T06:41:22Z)
- Dif-MAML: Decentralized Multi-Agent Meta-Learning [54.39661018886268]
We propose a cooperative multi-agent meta-learning algorithm, referred to as Diffusion-based MAML (Dif-MAML).
We show that the proposed strategy allows a collection of agents to attain agreement at a linear rate and to converge to a stationary point of the aggregate MAML objective.
Simulation results illustrate the theoretical findings and the superior performance relative to the traditional non-cooperative setting.
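A minimal sketch of a diffusion-style (adapt-then-combine) step in this spirit; the combination matrix and the local meta-gradient oracle are assumed inputs rather than the paper's exact construction.

```python
# Illustrative adapt-then-combine step for decentralized meta-learning (assumed setup).
import numpy as np

def diffusion_meta_step(W, meta_grads, A, lr=0.01):
    """W: (K, d) stacked agent models; meta_grads(k, w) -> agent k's local meta-gradient;
    A: (K, K) doubly-stochastic combination matrix (A[l, k] > 0 only for neighbors)."""
    K = W.shape[0]
    psi = np.stack([W[k] - lr * meta_grads(k, W[k]) for k in range(K)])  # adapt locally
    return A.T @ psi                                # combine: w_k = sum_l A[l, k] * psi_l
```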
arXiv Detail & Related papers (2020-10-06T16:51:09Z)
- Weighted Meta-Learning [21.522768804834616]
Many popular meta-learning algorithms, such as model-agnostic meta-learning (MAML), only assume access to the target samples for fine-tuning.
In this work, we provide a general framework for meta-learning based on weighting the loss of different source tasks.
We develop a learning algorithm based on minimizing the error bound with respect to an empirical IPM, including a weighted MAML algorithm.
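For illustration only, a weighted MAML outer objective might look like the sketch below, which mixes per-task adapted losses with fixed simplex weights; how the weights are chosen (the paper ties them to an empirical IPM bound) is abstracted away, and the functional loss interface is a hypothetical.

```python
# Illustrative weighted MAML outer objective (assumed functional interface).
import torch

def weighted_maml_loss(meta_params, source_tasks, alpha, inner_lr=0.01):
    """source_tasks: list of (support_loss_fn, query_loss_fn) callables over parameter
    lists; alpha: 1-D tensor of task weights summing to 1."""
    total = 0.0
    for a, (support_loss_fn, query_loss_fn) in zip(alpha, source_tasks):
        loss_s = support_loss_fn(meta_params)
        grads = torch.autograd.grad(loss_s, meta_params, create_graph=True)
        adapted = [p - inner_lr * g for p, g in zip(meta_params, grads)]
        total = total + a * query_loss_fn(adapted)   # weight each task's adapted loss
    return total
```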
arXiv Detail & Related papers (2020-03-20T19:00:42Z)
- Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide a convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
arXiv Detail & Related papers (2020-02-18T19:17:54Z)