B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning
- URL: http://arxiv.org/abs/2101.00203v1
- Date: Fri, 1 Jan 2021 09:19:48 GMT
- Title: B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning
- Authors: Anish Madan, Ranjitha Prasad
- Abstract summary: We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL on classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
- Score: 2.9189409618561966
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: There is a growing interest in the learning-to-learn paradigm, also known as
meta-learning, where models infer on new tasks using a few training examples.
Recently, meta-learning based methods have been widely used in few-shot
classification, regression, reinforcement learning, and domain adaptation. The
model-agnostic meta-learning (MAML) algorithm is a well-known algorithm that
obtains a model parameter initialization during the meta-training phase. In the
meta-test phase, this initialization is rapidly adapted to new tasks using
gradient descent. However, meta-learning models are prone to overfitting, since
there are insufficient training tasks, resulting in over-parameterized models
with poor generalization performance on unseen tasks. In this paper, we propose a
Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL
algorithm. The proposed framework incorporates a sparse variational loss term
alongside the loss function of MAML, which uses a sparsifying approximated KL
divergence as a regularizer. We demonstrate the performance of B-SMALL on
classification and regression tasks, and highlight that training a sparsifying
BNN using MAML indeed improves the parameter footprint of the model while
performing on par with, or even outperforming, the MAML approach. We also
illustrate the
applicability of our approach in distributed sensor networks, where sparsity
and meta-learning can be beneficial.
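Since the abstract names the ingredients but not the implementation, the following is a minimal sketch of how a MAML-style outer loop can be combined with a sparsifying variational loss on a Bayesian neural network. It is not the authors' code: it uses a first-order MAML simplification, a single-weight Bayesian linear regressor on synthetic tasks, and a Molchanov-style approximation of the KL divergence to a log-uniform prior as one possible instance of the "sparsifying approximated KL divergence"; all function names and constants are illustrative.

```python
import torch
import torch.nn.functional as F

def forward(x, mu, log_sigma2):
    """Sample weights with the reparameterization trick and apply a linear map."""
    eps = torch.randn_like(mu)
    w = mu + torch.exp(0.5 * log_sigma2) * eps
    return x @ w

def sparse_kl(mu, log_sigma2):
    """Molchanov-style approximation of the KL between the weight posterior and a
    log-uniform prior; a large log_alpha (high relative variance) prunes a weight."""
    k1, k2, k3 = 0.63576, 1.87320, 1.48695
    log_alpha = log_sigma2 - torch.log(mu.pow(2) + 1e-8)
    neg_kl = k1 * torch.sigmoid(k2 + k3 * log_alpha) - 0.5 * F.softplus(-log_alpha) - k1
    return -neg_kl.sum()

def task_loss(x, y, mu, log_sigma2, beta=1e-3):
    """Per-task loss plus the sparse variational term described in the abstract."""
    return F.mse_loss(forward(x, mu, log_sigma2), y) + beta * sparse_kl(mu, log_sigma2)

# Meta-parameters: posterior mean and log-variance of a single-weight regressor.
mu = torch.zeros(1, 1, requires_grad=True)
log_sigma2 = torch.full((1, 1), -6.0, requires_grad=True)
meta_opt = torch.optim.Adam([mu, log_sigma2], lr=1e-3)
inner_lr = 0.01

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):                                       # small batch of synthetic tasks
        slope = torch.randn(1)
        x_s, x_q = torch.randn(10, 1), torch.randn(10, 1)    # support / query sets
        y_s, y_q = slope * x_s, slope * x_q

        # Inner loop: adapt a copy of the meta-parameters on the support set
        # (first-order: no differentiation through the inner gradient itself).
        g_mu, g_ls = torch.autograd.grad(task_loss(x_s, y_s, mu, log_sigma2),
                                         (mu, log_sigma2))
        mu_a, ls_a = mu - inner_lr * g_mu, log_sigma2 - inner_lr * g_ls

        # Outer loop: the query-set loss of the adapted parameters contributes
        # to the meta-gradient of (mu, log_sigma2).
        task_loss(x_q, y_q, mu_a, ls_a).backward()
    meta_opt.step()
```

In a setup like this, weights whose posterior variance grows large relative to their mean contribute little to predictions and can be pruned, which is the sense in which sparsity reduces the parameter footprint of the meta-learned model.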
Related papers
- Model-Agnostic Multitask Fine-tuning for Few-shot Vision-Language
Transfer Learning [59.38343286807997]
We propose Model-Agnostic Multitask Fine-tuning (MAMF) for vision-language models on unseen tasks.
Compared with model-agnostic meta-learning (MAML), MAMF discards the bi-level optimization and uses only first-order gradients.
We show that MAMF consistently outperforms the classical fine-tuning method for few-shot transfer learning on five benchmark datasets.
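As a rough illustration of the "first-order gradients only" point above (a toy sketch, not MAMF itself): each per-task loss is differentiated directly at the shared parameters, with no inner adaptation loop to backpropagate through; `theta` and `task_losses` below are made-up placeholders.

```python
import torch

# Toy multitask, first-order update: each per-task loss is differentiated
# directly at the shared parameters theta (no inner loop, no second-order terms).
theta = torch.randn(8, requires_grad=True)
task_losses = [lambda p, a=a: ((p - a) ** 2).sum() for a in (0.5, -1.0, 2.0)]
opt = torch.optim.SGD([theta], lr=0.1)

opt.zero_grad()
for loss_fn in task_losses:
    loss_fn(theta).backward()   # first-order gradients only
opt.step()
```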
arXiv Detail & Related papers (2022-03-09T17:26:53Z) - Contextual Gradient Scaling for Few-Shot Learning [24.19934081878197]
We propose contextual gradient scaling (CxGrad) for model-agnostic meta-learning (MAML).
CxGrad scales gradient norms of the backbone to facilitate learning task-specific knowledge in the inner-loop.
Experimental results show that CxGrad effectively encourages the backbone to learn task-specific knowledge in the inner-loop.
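The summary does not describe CxGrad's task-conditioned scaling in detail, so the snippet below only shows where per-parameter gradient scaling would enter a generic MAML inner-loop update; the scale values are placeholders, not the learned, context-dependent factors used by CxGrad.

```python
import torch

def scaled_inner_step(params, grads, scales, inner_lr=0.01):
    """One generic MAML inner-loop update with per-parameter gradient scaling.
    In CxGrad the scale factors are predicted from task context; here they are
    fixed placeholders that merely show where the scaling enters the update."""
    return [p - inner_lr * s * g for p, g, s in zip(params, grads, scales)]

# Hypothetical usage with two parameter tensors and unit scales.
params = [torch.randn(3, 3), torch.randn(3)]
grads = [torch.ones_like(p) for p in params]
scales = [torch.ones_like(p) for p in params]
adapted = scaled_inner_step(params, grads, scales)
```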
arXiv Detail & Related papers (2021-10-20T03:05:58Z) - MAML is a Noisy Contrastive Learner [72.04430033118426]
Model-agnostic meta-learning (MAML) is one of the most popular and widely-adopted meta-learning algorithms nowadays.
We provide a new perspective on the working mechanism of MAML and discover that MAML is analogous to a meta-learner using a supervised contrastive objective function, in which the randomly initialized output layer introduces noise and cross-task interference.
We propose a simple but effective technique, the zeroing trick, to alleviate such interference.
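The summary does not define the zeroing trick; one common reading of the paper is that the final classification layer is reset to zero before each episode's inner-loop adaptation so that its random initialization does not inject cross-task noise. The sketch below assumes that reading, and `model.head` is a hypothetical attribute name for the final layer.

```python
import torch.nn as nn

def zero_head(model: nn.Module) -> None:
    """Reset the classifier head to zero before inner-loop adaptation (assumed
    reading of the 'zeroing trick'); model.head is a hypothetical final layer."""
    nn.init.zeros_(model.head.weight)
    if model.head.bias is not None:
        nn.init.zeros_(model.head.bias)
```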
arXiv Detail & Related papers (2021-06-29T12:52:26Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Energy-Efficient and Federated Meta-Learning via Projected Stochastic
Gradient Ascent [79.58680275615752]
We propose an energy-efficient federated meta-learning framework.
We assume each task is owned by a separate agent, so a limited number of tasks is used to train a meta-model.
arXiv Detail & Related papers (2021-05-31T08:15:44Z) - Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning [23.135033752967598]
We consider the novel problem of repurposing pretrained MAML checkpoints to solve new few-shot classification tasks.
Because of the potential distribution mismatch, the original MAML steps may no longer be optimal.
We propose an alternative meta-testing procedure and combine adversarial training with uncertainty-based step-size adaptation.
arXiv Detail & Related papers (2021-03-16T12:53:09Z) - Robust MAML: Prioritization task buffer with adaptive learning process
for model-agnostic meta-learning [15.894925018423665]
Model-agnostic meta-learning (MAML) is a popular state-of-the-art meta-learning algorithm.
This paper proposes a more robust MAML based on an adaptive learning scheme and a prioritization task buffer.
Experimental results on meta reinforcement learning environments demonstrate a substantial performance gain.
arXiv Detail & Related papers (2021-03-15T09:34:34Z) - On Fast Adversarial Robustness Adaptation in Model-Agnostic
Meta-Learning [100.14809391594109]
Model-agnostic meta-learning (MAML) has emerged as one of the most successful meta-learning techniques in few-shot learning.
Despite the generalization power of the meta-model, it remains unclear how adversarial robustness can be maintained by MAML in few-shot learning.
We propose a general but easily-optimized robustness-regularized meta-learning framework, which allows the use of unlabeled data augmentation, fast adversarial attack generation, and computationally-light fine-tuning.
arXiv Detail & Related papers (2021-02-20T22:03:04Z) - A Nested Bi-level Optimization Framework for Robust Few Shot Learning [10.147225934340877]
NestedMAML learns to assign weights to training tasks or instances.
Experiments on synthetic and real-world datasets demonstrate that NestedMAML efficiently mitigates the effects of "unwanted" tasks or instances.
arXiv Detail & Related papers (2020-11-13T06:41:22Z) - La-MAML: Look-ahead Meta Learning for Continual Learning [14.405620521842621]
We propose Look-ahead MAML (La-MAML), a fast optimisation-based meta-learning algorithm for online-continual learning, aided by a small episodic memory.
La-MAML achieves performance superior to other replay-based, prior-based and meta-learning based approaches for continual learning on real-world visual classification benchmarks.
arXiv Detail & Related papers (2020-07-27T23:07:01Z) - Theoretical Convergence of Multi-Step Model-Agnostic Meta-Learning [63.64636047748605]
We develop a new theoretical framework to provide a convergence guarantee for the general multi-step MAML algorithm.
In particular, our results suggest that the inner-stage step size needs to be chosen inversely proportional to the number $N$ of inner-stage steps in order for $N$-step MAML to have guaranteed convergence.
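Stated as a formula (a paraphrase of the claim above, not a derivation carried out here):

```latex
% Inner-loop step size scaling required for guaranteed convergence of N-step MAML,
% paraphrased from the abstract.
\alpha \;\propto\; \frac{1}{N}
```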
arXiv Detail & Related papers (2020-02-18T19:17:54Z)
This list is automatically generated from the titles and abstracts of the papers on this site.