Hypernetwork approach to Bayesian MAML
- URL: http://arxiv.org/abs/2210.02796v2
- Date: Wed, 30 Aug 2023 21:35:48 GMT
- Title: Hypernetwork approach to Bayesian MAML
- Authors: Piotr Borycki, Piotr Kubacki, Marcin Przewięźlikowski, Tomasz Kuśmierczyk, Jacek Tabor, Przemysław Spurek
- Abstract summary: We propose a novel framework for Bayesian MAML called BayesianHMAML.
It learns the universal weights point-wise, but adds a probabilistic structure when the weights are adapted to specific tasks.
- Score: 13.012692001087617
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The main goal of Few-Shot learning algorithms is to enable learning from
small amounts of data. One of the most popular and elegant Few-Shot learning
approaches is Model-Agnostic Meta-Learning (MAML). The main idea behind this
method is to learn the shared universal weights of a meta-model, which are then
adapted for specific tasks. However, the method suffers from over-fitting and
poorly quantifies uncertainty due to limited data size. Bayesian approaches
could, in principle, alleviate these shortcomings by learning weight
distributions in place of point-wise weights. Unfortunately, previous
modifications of MAML are limited by the simplicity of Gaussian posteriors, by
MAML-like gradient-based weight updates, or by the same structure being
enforced for the universal and the adapted weights.
In this paper, we propose a novel framework for Bayesian MAML called
BayesianHMAML, which employs Hypernetworks for weight updates. It learns the
universal weights point-wise, but adds a probabilistic structure when they are
adapted for specific tasks. In such a framework, we can use simple Gaussian
distributions or more complicated posteriors induced by Continuous Normalizing
Flows.
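A minimal PyTorch-style sketch of the adaptation step described above (module and tensor names are hypothetical and this is not the authors' implementation): the universal weights are kept as a point estimate, while a hypernetwork maps a support-set embedding to the mean and log-variance of a Gaussian posterior over the task-specific weight update.

```python
import torch
import torch.nn as nn

class GaussianWeightHypernet(nn.Module):
    """Illustrative sketch: a hypernetwork that turns a support-set embedding
    into a Gaussian posterior over a task-specific weight update."""

    def __init__(self, embed_dim: int, num_target_weights: int):
        super().__init__()
        # Universal weights are learned point-wise and shared across tasks.
        self.universal_weights = nn.Parameter(torch.zeros(num_target_weights))
        # Hypernetwork heads produce the posterior mean and log-variance.
        self.mean_head = nn.Linear(embed_dim, num_target_weights)
        self.logvar_head = nn.Linear(embed_dim, num_target_weights)

    def forward(self, support_embedding: torch.Tensor) -> torch.Tensor:
        mu = self.mean_head(support_embedding)
        logvar = self.logvar_head(support_embedding)
        # Reparameterised sample of the task-specific update.
        eps = torch.randn_like(mu)
        update = mu + torch.exp(0.5 * logvar) * eps
        # Adapted weights = shared point estimate + sampled update.
        return self.universal_weights + update
```

The Gaussian heads could be swapped for a Continuous Normalizing Flow to obtain the more expressive posteriors mentioned in the abstract, at the cost of an ODE-based density evaluation.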
Related papers
- Scaling Forward Gradient With Local Losses [117.22685584919756]
Forward learning is a biologically plausible alternative to backprop for learning deep neural networks.
We show that it is possible to substantially reduce the variance of the forward gradient by applying perturbations to activations rather than weights.
Our approach matches backprop on MNIST and CIFAR-10 and significantly outperforms previously proposed backprop-free algorithms on ImageNet.
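As a rough illustration of the forward-gradient idea (the paper's local-loss blocks are not reproduced here, and the function name is ours):

```python
import torch

def forward_gradient(scalar_fn, x):
    """Forward-gradient estimate (d scalar_fn(x)/dx . v) * v for a random
    tangent v, computed with a single Jacobian-vector product. `x` may be a
    weight tensor or, as in the activation-perturbation variant, a layer
    activation fed into a local loss."""
    v = torch.randn_like(x)
    value, directional = torch.autograd.functional.jvp(scalar_fn, (x,), (v,))
    # Scale the random direction by the scalar directional derivative.
    return value, directional * v
```

Calling forward_gradient(local_loss, activation) perturbs the activation rather than the weights, which is where the paper reports the main variance reduction.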
arXiv Detail & Related papers (2022-10-07T03:52:27Z) - Content Popularity Prediction Based on Quantized Federated Bayesian
Learning in Fog Radio Access Networks [76.16527095195893]
We investigate the content popularity prediction problem in cache-enabled fog radio access networks (F-RANs).
In order to predict the content popularity with high accuracy and low complexity, we propose a Gaussian process based regressor to model the content request pattern.
We utilize Bayesian learning to train the model parameters, which is robust to overfitting.
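A toy sketch of the modelling choice described above, using a generic Gaussian-process regressor (the features, data, and kernel are made up for illustration and are not taken from the paper):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

# Hypothetical per-content features (e.g. recency, past request counts)
# and observed request rates; not the paper's dataset.
rng = np.random.default_rng(0)
features = rng.random((200, 3))
requests = rng.poisson(5.0, size=200).astype(float)

# The GP regressor models the content request pattern; maximising the
# marginal likelihood plays the role of Bayesian training and is
# comparatively robust to overfitting.
gp = GaussianProcessRegressor(kernel=RBF() + WhiteKernel(), normalize_y=True)
gp.fit(features, requests)
mean, std = gp.predict(rng.random((5, 3)), return_std=True)
```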
arXiv Detail & Related papers (2022-06-23T03:05:12Z) - HyperMAML: Few-Shot Adaptation of Deep Models with Hypernetworks [0.0]
Few-Shot learning aims to train models which can easily adapt to previously unseen tasks.
Model-Agnostic Meta-Learning (MAML) is one of the most popular Few-Shot learning approaches.
In this paper, we propose HyperMAML, where the training of the update procedure is also part of the model.
arXiv Detail & Related papers (2022-05-31T12:31:21Z) - Memory-Based Optimization Methods for Model-Agnostic Meta-Learning and
Personalized Federated Learning [56.17603785248675]
Model-agnostic meta-learning (MAML) has become a popular research area.
Existing MAML algorithms rely on the 'episode' idea by sampling a few tasks and data points to update the meta-model at each iteration.
This paper proposes memory-based algorithms for MAML that converge with vanishing error.
arXiv Detail & Related papers (2021-06-09T08:47:58Z) - Repurposing Pretrained Models for Robust Out-of-domain Few-Shot Learning [23.135033752967598]
We consider the novel problem of repurposing pretrained MAML checkpoints to solve new few-shot classification tasks.
Because of the potential distribution mismatch, the original MAML steps may no longer be optimal.
We propose an alternative meta-testing procedure and combine adversarial training with uncertainty-based step-size adaptation.
arXiv Detail & Related papers (2021-03-16T12:53:09Z) - B-SMALL: A Bayesian Neural Network approach to Sparse Model-Agnostic
Meta-Learning [2.9189409618561966]
We propose a Bayesian neural network based MAML algorithm, which we refer to as the B-SMALL algorithm.
We demonstrate the performance of B-SMALL using classification and regression tasks, and highlight that training a sparsifying BNN using MAML indeed improves the parameter footprint of the model.
arXiv Detail & Related papers (2021-01-01T09:19:48Z) - Meta-Generating Deep Attentive Metric for Few-shot Classification [53.07108067253006]
We present a novel deep metric meta-generation method to generate a specific metric for a new few-shot learning task.
In this study, we structure the metric using a three-layer deep attentive network that is flexible enough to produce a discriminative metric for each task.
We obtain clear performance improvements over state-of-the-art competitors, especially in the challenging cases.
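A simplified stand-in for the task-specific metric generation described above (the paper uses a three-layer attentive generator; the diagonal metric and names here are illustrative assumptions):

```python
import torch
import torch.nn as nn

class MetricGenerator(nn.Module):
    """Maps a task embedding to a per-dimension metric used to compare
    query and support features (illustrative simplification)."""

    def __init__(self, task_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(task_dim, feat_dim),
            nn.ReLU(),
            nn.Linear(feat_dim, feat_dim),
            nn.Softplus(),  # keeps the metric scales positive
        )

    def forward(self, task_embedding, query_feat, support_feat):
        scale = self.net(task_embedding)             # task-specific scales
        diff = query_feat - support_feat
        return torch.sum(scale * diff ** 2, dim=-1)  # weighted squared distance
```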
arXiv Detail & Related papers (2020-12-03T02:07:43Z) - Meta-Learning with Adaptive Hyperparameters [55.182841228303225]
We focus on a complementary factor in the MAML framework: inner-loop optimization (or fast adaptation).
We propose a new weight update rule that greatly enhances the fast adaptation process.
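One simple instance of such an update rule, with a learned step size per parameter tensor (the paper's exact rule may differ; names are ours):

```python
import torch

def adaptive_inner_step(params, grads, log_alphas):
    """One fast-adaptation step  theta' = theta - exp(log_alpha) * grad,
    where each log_alpha is a meta-learned step size (illustrative sketch)."""
    return [p - torch.exp(a) * g for p, g, a in zip(params, grads, log_alphas)]

# The log_alphas are optimised in the outer loop together with the
# initialisation, so the update rule itself is meta-learned.
```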
arXiv Detail & Related papers (2020-10-31T08:05:34Z) - Bayesian Deep Learning via Subnetwork Inference [2.2835610890984164]
We show that it suffices to perform inference over a small subset of model weights in order to obtain accurate predictive posteriors.
This subnetwork inference framework enables us to use expressive, otherwise intractable, posterior approximations over such subsets.
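A rough sketch of the subnetwork selection step (magnitude-based selection is our stand-in; the paper's criterion and the posterior fitted over the chosen subset may differ):

```python
import torch

def select_subnetwork(model: torch.nn.Module, k: int) -> torch.Tensor:
    """Pick the k largest-magnitude weights as the subnetwork over which a
    more expressive posterior is inferred; all other weights keep their
    point estimates (illustrative selection rule)."""
    flat = torch.cat([p.detach().flatten() for p in model.parameters()])
    _, idx = torch.topk(flat.abs(), k)
    return idx  # indices into the flattened parameter vector
```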
arXiv Detail & Related papers (2020-10-28T01:10:11Z) - Weighted Meta-Learning [21.522768804834616]
Many popular meta-learning algorithms, such as model-agnostic meta-learning (MAML), only assume access to the target samples for fine-tuning.
In this work, we provide a general framework for meta-learning based on weighting the loss of different source tasks.
We develop a learning algorithm based on minimizing the error bound with respect to an empirical integral probability metric (IPM), including a weighted MAML algorithm.
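A minimal sketch of a weighted meta-objective in this spirit (the error-bound and empirical-IPM machinery is not reproduced; simplex weights via softmax are an assumption):

```python
import torch

def weighted_meta_loss(task_losses, logits):
    """Weighted sum of per-source-task meta-losses with weights on the
    simplex; in the paper the weights are chosen to minimise an error
    bound on the target task (illustrative sketch)."""
    weights = torch.softmax(logits, dim=0)
    return torch.sum(weights * torch.stack(task_losses))
```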
arXiv Detail & Related papers (2020-03-20T19:00:42Z) - Meta Cyclical Annealing Schedule: A Simple Approach to Avoiding
Meta-Amortization Error [50.83356836818667]
We develop a novel meta-regularization objective using a cyclical annealing schedule and a maximum mean discrepancy (MMD) criterion.
The experimental results show that our approach substantially outperforms standard meta-learning algorithms.
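Two small helpers illustrating the ingredients named above, a cyclically annealed regularisation weight and an RBF-kernel MMD estimate (the exact schedule and kernel in the paper may differ):

```python
import torch

def cyclical_beta(step: int, cycle_length: int, max_beta: float = 1.0) -> float:
    """Ramp a regularisation weight from 0 to max_beta within each cycle,
    then hold it (one common form of cyclical annealing)."""
    phase = (step % cycle_length) / cycle_length
    return max_beta * min(1.0, 2.0 * phase)

def rbf_mmd(x: torch.Tensor, y: torch.Tensor, sigma: float = 1.0) -> torch.Tensor:
    """Biased estimate of the squared maximum mean discrepancy between two
    sample sets under an RBF kernel."""
    def k(a, b):
        return torch.exp(-torch.cdist(a, b) ** 2 / (2 * sigma ** 2))
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()
```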
arXiv Detail & Related papers (2020-03-04T04:43:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.