Memory-Based Meta-Learning on Non-Stationary Distributions
- URL: http://arxiv.org/abs/2302.03067v2
- Date: Thu, 25 May 2023 17:53:32 GMT
- Title: Memory-Based Meta-Learning on Non-Stationary Distributions
- Authors: Tim Genewein, Grégoire Delétang, Anian Ruoss, Li Kevin Wenliang, Elliot Catt, Vincent Dutordoir, Jordi Grau-Moya, Laurent Orseau, Marcus Hutter, Joel Veness
- Abstract summary: Memory-based meta-learning is a technique for approximating Bayes-optimal predictors.
We show that memory-based neural models, including Transformers, LSTMs, and RNNs, can learn to accurately approximate known Bayes-optimal algorithms.
- Score: 29.443692147512742
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Memory-based meta-learning is a technique for approximating Bayes-optimal
predictors. Under fairly general conditions, minimizing sequential prediction
error, measured by the log loss, leads to implicit meta-learning. The goal of
this work is to investigate how far this interpretation can be realized by
current sequence prediction models and training regimes. The focus is on
piecewise stationary sources with unobserved switching-points, which arguably
capture an important characteristic of natural language and action-observation
sequences in partially observable environments. We show that various types of
memory-based neural models, including Transformers, LSTMs, and RNNs, can learn
to accurately approximate known Bayes-optimal algorithms and behave as if
performing Bayesian inference over the latent switching-points and the latent
parameters governing the data distribution within each segment.
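The setting is easy to reproduce. Below is a minimal sketch (the binary alphabet, the switch probability `alpha`, and the Beta(1/2, 1/2) prior are illustrative assumptions, not taken from the paper): it samples a piecewise-stationary Bernoulli stream with unobserved switching points and scores the Krichevsky-Trofimov (KT) estimator, the Bayes-optimal predictor for a single stationary segment, under the log loss. A sequence model trained to minimize the same loss should learn to additionally track the switches.

```python
import numpy as np

rng = np.random.default_rng(0)

# Piecewise-stationary Bernoulli source: the latent bias theta is redrawn
# from Uniform(0, 1) with probability alpha at every step, creating
# unobserved switching points.
alpha, T = 0.01, 1000
theta = rng.uniform()
xs = []
for _ in range(T):
    if rng.uniform() < alpha:
        theta = rng.uniform()                  # unobserved switch
    xs.append(int(rng.uniform() < theta))

# KT estimator: the Bayes-optimal posterior predictive for a *stationary*
# Bernoulli segment under a Beta(1/2, 1/2) prior. A meta-trained model
# should match it within segments and beat it overall by also inferring
# the switching points.
ones = total = 0
log_loss = 0.0
for x in xs:
    p1 = (ones + 0.5) / (total + 1.0)          # P(next symbol = 1)
    log_loss -= np.log(p1 if x == 1 else 1.0 - p1)
    ones += x
    total += 1

print(f"KT estimator log loss per step: {log_loss / T:.3f}")
```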
Related papers
- Amortised Inference in Neural Networks for Small-Scale Probabilistic Meta-Learning [41.85464593920907]
A global inducing point variational approximation for BNNs uses a set of inducing inputs to construct a series of conditional distributions.
Our key insight is that these inducing inputs can be replaced by the actual data, such that the variational distribution consists of a set of approximate likelihoods for each datapoint.
By training this inference network across related datasets, we can meta-learn Bayesian inference over task-specific BNNs.
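As a rough illustration of the per-datapoint likelihood factorisation, the sketch below works through the idea on a 1-D Bayesian linear model, where the product of the prior and the per-point Gaussian factors has a closed form. The model, noise levels, and factor parameterisation are assumptions for illustration; in the paper the factors come from a meta-learned inference network over BNNs, where no closed form exists.

```python
import numpy as np

# Each datapoint contributes a Gaussian factor over the weight w of a 1-D
# linear model y = w * x + noise; multiplying the prior by all factors
# gives the posterior. An amortisation network would output these factors.
rng = np.random.default_rng(1)
x = rng.normal(size=20)
y = 2.0 * x + 0.1 * rng.normal(size=20)        # true weight: 2.0

prior_prec, noise_prec = 1.0, 100.0
# Factor N(y_i | w x_i, 1/noise_prec) adds precision noise_prec * x_i^2
# and precision-weighted mean noise_prec * x_i * y_i.
post_prec = prior_prec + noise_prec * np.sum(x ** 2)
post_mean = noise_prec * np.sum(x * y) / post_prec

print(f"posterior over w: mean={post_mean:.3f}, var={1.0 / post_prec:.4f}")
```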
arXiv Detail & Related papers (2023-10-24T12:34:25Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
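A toy analogue of the underlying training signal: the denoising-score-matching loop below fits the score of a 1-D distribution from samples. MARS itself meta-learns the score of the data-generating process in function space, across related datasets; the single-dataset 1-D setting and all hyperparameters here are simplifying assumptions.

```python
import torch

torch.manual_seed(0)
# Samples from a bimodal 1-D data distribution.
data = torch.cat([torch.randn(500) - 2.0, torch.randn(500) + 2.0]).unsqueeze(1)

net = torch.nn.Sequential(torch.nn.Linear(1, 64), torch.nn.Tanh(),
                          torch.nn.Linear(64, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
sigma = 0.5
for _ in range(2000):
    noise = sigma * torch.randn_like(data)
    # Denoising score matching: the minimizer of this loss is the score of
    # the sigma-smoothed data density, so net(x + noise) -> -noise / sigma^2.
    loss = ((net(data + noise) + noise / sigma ** 2) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```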
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Correcting Model Bias with Sparse Implicit Processes [0.9187159782788579]
We show that Sparse Implicit Processes (SIP) is capable of correcting model bias when the data-generating mechanism differs strongly from the one implied by the model.
We use synthetic datasets to show that SIP provides predictive distributions that reflect the data better than the exact predictions of the initial, wrongly assumed model.
arXiv Detail & Related papers (2022-07-21T18:00:01Z)
- Transformers Can Do Bayesian Inference [56.99390658880008]
We present Prior-Data Fitted Networks (PFNs).
PFNs leverage in-context learning and large-scale machine learning techniques to approximate a large set of posteriors.
We demonstrate that PFNs can near-perfectly mimic Gaussian processes and also enable efficient Bayesian inference for intractable problems.
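The PFN recipe is simple enough to sketch: repeatedly sample small datasets from a prior, and train a sequence model to predict a held-out target from the in-context examples. The toy version below uses a linear-regression prior, full attention, and a squared-error head purely for brevity; the actual PFNs use a discretised predictive distribution trained with log loss and a specific attention mask, so everything here is an illustrative assumption.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
d_model, n_ctx = 64, 10
embed = nn.Linear(2, d_model)          # embeds (x, y) pairs
enc = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model, 4, 128, batch_first=True), 2)
head = nn.Linear(d_model, 1)
opt = torch.optim.Adam([*embed.parameters(), *enc.parameters(),
                        *head.parameters()], lr=1e-3)

def sample_task(batch):
    # Prior over tasks: y = w * x + noise with w ~ N(0, 1).
    w = torch.randn(batch, 1)
    x = torch.randn(batch, n_ctx + 1)
    y = w * x + 0.1 * torch.randn_like(x)
    return x, y

for _ in range(500):
    x, y = sample_task(32)
    tokens = torch.stack([x, y], -1)
    tokens[:, -1, 1] = 0.0             # hide the query target
    pred = head(enc(embed(tokens)))[:, -1, 0]
    loss = ((pred - y[:, -1]) ** 2).mean()   # amortised posterior prediction
    opt.zero_grad(); loss.backward(); opt.step()
```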
arXiv Detail & Related papers (2021-12-20T13:07:39Z)
- False Positive Detection and Prediction Quality Estimation for LiDAR Point Cloud Segmentation [5.735035463793009]
We present a novel post-processing tool for semantic segmentation of LiDAR point cloud data, called LidarMetaSeg.
We compute dispersion measures based on network probability outputs as well as feature measures based on point cloud input features, and aggregate them at the segment level.
These aggregated measures are used to train a meta classification model, which predicts whether a predicted segment is a false positive, and a meta regression model, which predicts the segment-wise intersection over union.
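A minimal sketch of the meta classification/regression step on synthetic stand-in data (the real tool aggregates many dispersion and input features from annotated LiDAR scans; the single entropy feature and the synthetic labels below are assumptions for illustration):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(0)

# Per predicted segment, aggregate a dispersion measure: the mean softmax
# entropy over the segment's points.
n_seg, n_pts, n_cls = 300, 30, 5
logits = rng.normal(size=(n_seg, n_pts, n_cls))
p = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
entropy = (-(p * np.log(p)).sum(-1)).mean(-1)    # one feature per segment

# Synthetic ground truth: higher dispersion -> more likely a false positive
# and a lower IoU (in the real pipeline these come from annotated scans).
z = (entropy - entropy.mean()) / entropy.std()
is_fp = rng.binomial(1, 1.0 / (1.0 + np.exp(-3 * z)))
iou = np.clip(0.7 - 0.3 * z + 0.05 * rng.normal(size=n_seg), 0, 1) * (1 - is_fp)

X = entropy.reshape(-1, 1)
meta_clf = LogisticRegression().fit(X, is_fp)    # meta classification
meta_reg = LinearRegression().fit(X, iou)        # meta regression (IoU)
```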
arXiv Detail & Related papers (2021-10-29T11:00:30Z)
- Bayesian Meta-Learning Through Variational Gaussian Processes [0.0]
We extend Gaussian-process-based meta-learning to allow for high-quality, arbitrary non-Gaussian uncertainty predictions.
Our method performs significantly better than existing Bayesian meta-learning baselines.
arXiv Detail & Related papers (2021-10-21T10:44:23Z)
- Meta Learning Low Rank Covariance Factors for Energy-Based Deterministic Uncertainty [58.144520501201995]
Bi-Lipschitz regularization of neural network layers preserves relative distances between data instances in the feature spaces of each layer.
With the use of an attentive set encoder, we propose to meta-learn either diagonal or diagonal-plus-low-rank factors to efficiently construct task-specific covariance matrices.
We also propose an inference procedure which utilizes scaled energy to achieve a final predictive distribution.
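For intuition, the sketch below builds a diagonal-plus-low-rank covariance Sigma = diag(d) + U U^T and evaluates a Mahalanobis energy term via the Woodbury identity, which keeps the cost low when the rank is small. The dimensions and the way the factors are produced (random here, meta-learned by the set encoder in the paper) are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
dim, r = 64, 4
d = np.exp(rng.normal(size=dim))       # positive diagonal
U = 0.1 * rng.normal(size=(dim, r))    # low-rank factor

def solve_sigma(v):
    # (D + U U^T)^{-1} v via Woodbury: only an r x r system is solved.
    Dinv_v = v / d
    Dinv_U = U / d[:, None]
    small = np.eye(r) + U.T @ Dinv_U
    return Dinv_v - Dinv_U @ np.linalg.solve(small, U.T @ Dinv_v)

x = rng.normal(size=dim)
energy = 0.5 * x @ solve_sigma(x)      # Mahalanobis energy term

# Sanity check against the dense inverse.
Sigma = np.diag(d) + U @ U.T
assert np.allclose(solve_sigma(x), np.linalg.solve(Sigma, x))
```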
arXiv Detail & Related papers (2021-10-12T22:04:19Z)
- Unlabelled Data Improves Bayesian Uncertainty Calibration under Covariate Shift [100.52588638477862]
We develop an approximate Bayesian inference scheme based on posterior regularisation.
We demonstrate the utility of our method in the context of transferring prognostic models of prostate cancer across globally diverse populations.
arXiv Detail & Related papers (2020-06-26T13:50:19Z)
- Meta-Learned Confidence for Few-shot Learning [60.6086305523402]
A popular transductive inference technique for few-shot metric-based approaches is to update the prototype of each class with the mean of the most confident query examples.
We propose to meta-learn the confidence for each query sample, to assign optimal weights to unlabeled queries.
We validate our few-shot learning model with meta-learned confidence on four benchmark datasets.
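The transductive update being meta-learned can be sketched in a few lines: weight each unlabeled query by a confidence derived from its distance to the class prototypes, then refine the prototypes with the weighted queries. The random embeddings, the exponential confidence, and the single refinement step below are illustrative assumptions; in the paper the confidence measure itself is meta-learned.

```python
import numpy as np

rng = np.random.default_rng(0)
n_cls, d = 5, 16
protos = rng.normal(size=(n_cls, d))    # class prototypes from the support set
queries = rng.normal(size=(40, d))      # unlabeled query embeddings

# Softmax confidence of each query over the classes (negative squared distance).
dists = ((queries[:, None, :] - protos[None]) ** 2).sum(-1)   # (40, 5)
conf = np.exp(-dists)
conf /= conf.sum(-1, keepdims=True)

# One confidence-weighted prototype refinement step; each original prototype
# counts as a single support example with weight 1.
new_protos = (conf.T @ queries + protos) / (conf.sum(0)[:, None] + 1.0)
```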
arXiv Detail & Related papers (2020-02-27T10:22:17Z)
- Meta-learning framework with applications to zero-shot time-series forecasting [82.61728230984099]
This work provides positive evidence using a broad meta-learning framework in which residual connections act as a meta-learning adaptation mechanism.
We show that it is viable to train a neural network on a source TS dataset and deploy it on a different target TS dataset without retraining.
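As a structural illustration of the residual-connections-as-adaptation claim, the sketch below runs an untrained doubly-residual stack in the spirit of N-BEATS: each block explains part of the input window (backcast) and emits a partial forecast, and the residual passed forward is what lets the stack adjust to an unseen series. The weights and sizes are placeholders, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
lookback, horizon, n_blocks = 20, 5, 3
window = np.sin(np.linspace(0, 4 * np.pi, lookback))   # one input window

forecast = np.zeros(horizon)
residual = window.copy()
for _ in range(n_blocks):
    # Placeholder (untrained) block weights; a real model learns these.
    W_back = 0.1 * rng.normal(size=(lookback, lookback))
    W_fore = 0.1 * rng.normal(size=(lookback, horizon))
    forecast += residual @ W_fore            # partial forecast from this block
    residual = residual - residual @ W_back  # backcast residual = adaptation

print(forecast)
```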
arXiv Detail & Related papers (2020-02-07T16:39:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented (including the generated summaries) and is not responsible for any consequences of its use.