Meta Learning as Bayes Risk Minimization
- URL: http://arxiv.org/abs/2006.01488v1
- Date: Tue, 2 Jun 2020 09:38:00 GMT
- Title: Meta Learning as Bayes Risk Minimization
- Authors: Shin-ichi Maeda, Toshiki Nakanishi, Masanori Koyama
- Abstract summary: We use a probabilistic framework to formalize what it means for two tasks to be related.
In our formulation, the BRM optimal solution is given by the predictive distribution computed from the posterior distribution of the task-specific latent variable conditioned on the contextual dataset.
We show that our approximation of the posterior distributions converges to the maximum likelihood estimate with the same rate as the true posterior distribution.
- Score: 18.76745359031975
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Meta-Learning is a family of methods that use a set of interrelated tasks to
learn a model that can quickly learn a new query task from a possibly small
contextual dataset. In this study, we use a probabilistic framework to
formalize what it means for two tasks to be related and reframe the
meta-learning problem into the problem of Bayesian risk minimization (BRM). In
our formulation, the BRM optimal solution is given by the predictive
distribution computed from the posterior distribution of the task-specific
latent variable conditioned on the contextual dataset, and this justifies the
philosophy of Neural Process. However, the posterior distribution in Neural
Process violates the way the posterior distribution changes with the contextual
dataset. To address this problem, we present a novel Gaussian approximation for
the posterior distribution that generalizes the posterior of the linear
Gaussian model. Unlike that of the Neural Process, our approximation of the
posterior distributions converges to the maximum likelihood estimate with the
same rate as the true posterior distribution. We also demonstrate the
competitiveness of our approach on benchmark datasets.
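The linear Gaussian model mentioned above is the special case in which the posterior is available in closed form; a minimal sketch of its conjugate posterior update (illustrative only, with hypothetical variable names, not the paper's proposed approximation):

```python
import numpy as np

# Conjugate posterior for a linear Gaussian model:
#   prior:      z ~ N(0, tau2 * I)
#   likelihood: y = X @ z + eps,  eps ~ N(0, sigma2 * I)
def linear_gaussian_posterior(X, y, sigma2=0.1, tau2=1.0):
    d = X.shape[1]
    precision = X.T @ X / sigma2 + np.eye(d) / tau2  # posterior precision
    cov = np.linalg.inv(precision)                   # posterior covariance
    mean = cov @ (X.T @ y) / sigma2                  # posterior mean
    return mean, cov

rng = np.random.default_rng(0)
z_true = np.array([1.0, -2.0])
X = rng.normal(size=(50, 2))
y = X @ z_true + 0.1 * rng.normal(size=50)
mean, cov = linear_gaussian_posterior(X, y)
```

As the contextual dataset grows, the posterior mean approaches the maximum likelihood estimate and the covariance contracts — the convergence behavior the abstract contrasts with the Neural Process posterior.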
Related papers
- On Discriminative Probabilistic Modeling for Self-Supervised Representation Learning [85.75164588939185]
We study the discriminative probabilistic modeling problem on a continuous domain for (multimodal) self-supervised representation learning.
We conduct generalization error analysis to reveal the limitation of current InfoNCE-based contrastive loss for self-supervised representation learning.
arXiv Detail & Related papers (2024-10-11T18:02:46Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- Collaborative Heterogeneous Causal Inference Beyond Meta-analysis [68.4474531911361]
We propose a collaborative inverse propensity score estimator for causal inference with heterogeneous data.
Our method shows significant improvements over the methods based on meta-analysis when heterogeneity increases.
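As background, the classical single-dataset inverse propensity score estimator that such collaborative methods extend can be sketched as follows (a toy example with known propensities, not the paper's estimator):

```python
import numpy as np

# Basic inverse propensity weighting (IPW) estimate of the average
# treatment effect, assuming the propensity scores e(x) = P(t=1 | x)
# are known.
def ipw_ate(y, t, e):
    y, t, e = map(np.asarray, (y, t, e))
    return np.mean(t * y / e - (1 - t) * y / (1 - e))

# Toy data: treatment raises the outcome by 2, propensity is 0.5 everywhere.
ate = ipw_ate(y=[3.0, 1.0, 3.0, 1.0], t=[1, 0, 1, 0], e=[0.5] * 4)
```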
arXiv Detail & Related papers (2024-04-24T09:04:36Z)
- SPDE priors for uncertainty quantification of end-to-end neural data assimilation schemes [4.213142548113385]
Recent advances in the deep learning community enable us to address this problem with neural architectures embedding a variational data assimilation framework.
In this work, we draw from SPDE-based Processes to estimate prior models able to handle non-stationary covariances in both space and time.
Our neural variational scheme is modified to embed an augmented state formulation, with both the state and the SPDE parametrization to be estimated.
arXiv Detail & Related papers (2024-02-02T19:18:12Z)
- Out of the Ordinary: Spectrally Adapting Regression for Covariate Shift [12.770658031721435]
We propose a method for adapting the weights of the last layer of a pre-trained neural regression model to perform better on input data originating from a different distribution.
We demonstrate how this lightweight spectral adaptation procedure can improve out-of-distribution performance for synthetic and real-world datasets.
arXiv Detail & Related papers (2023-12-29T04:15:58Z)
- Variational Density Propagation Continual Learning [0.0]
Deep Neural Networks (DNNs) deployed to the real world are regularly subject to out-of-distribution (OoD) data.
This paper proposes a framework for adapting to data distribution drift modeled by benchmark Continual Learning datasets.
arXiv Detail & Related papers (2023-08-22T21:51:39Z)
- Introduction To Gaussian Process Regression In Bayesian Inverse Problems, With New Results On Experimental Design For Weighted Error Measures [0.0]
This work serves as an introduction to Gaussian process regression, in particular in the context of building surrogate models for inverse problems.
We show that the error between the true and approximate posterior distribution can be bounded by the error between the true and approximate likelihood, measured in the $L2$-norm weighted by the true posterior.
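For context, the GP regression posterior on which such surrogate models rely has a closed form; a minimal sketch with an RBF kernel on 1-D inputs (illustrative only, not the paper's weighted-error setting):

```python
import numpy as np

def rbf(a, b, ell=1.0):
    """Squared-exponential (RBF) kernel on 1-D inputs."""
    d = a[:, None] - b[None, :]
    return np.exp(-0.5 * (d / ell) ** 2)

def gp_posterior(x_train, y_train, x_test, noise=1e-2):
    # Standard GP regression equations: condition the prior on noisy
    # observations to get the posterior mean and covariance at x_test.
    K = rbf(x_train, x_train) + noise * np.eye(len(x_train))
    Ks = rbf(x_train, x_test)
    mean = Ks.T @ np.linalg.solve(K, y_train)
    cov = rbf(x_test, x_test) - Ks.T @ np.linalg.solve(K, Ks)
    return mean, cov

x_train = np.linspace(0.0, 3.0, 20)
y_train = np.sin(x_train)
mean, cov = gp_posterior(x_train, y_train, np.array([1.5]))
```

With dense training data near the test point, the posterior mean closely tracks the target function and the posterior variance is small.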
arXiv Detail & Related papers (2023-02-09T09:25:39Z)
- Variational Laplace Autoencoders [53.08170674326728]
Variational autoencoders employ an amortized inference model to approximate the posterior of latent variables.
We present a novel approach that addresses the limited posterior expressiveness of fully-factorized Gaussian assumption.
We also present a general framework named Variational Laplace Autoencoders (VLAEs) for training deep generative models.
arXiv Detail & Related papers (2022-11-30T18:59:27Z)
- MINIMALIST: Mutual INformatIon Maximization for Amortized Likelihood Inference from Sampled Trajectories [61.3299263929289]
Simulation-based inference enables learning the parameters of a model even when its likelihood cannot be computed in practice.
One class of methods uses data simulated with different parameters to infer an amortized estimator for the likelihood-to-evidence ratio.
We show that this approach can be formulated in terms of mutual information between model parameters and simulated data.
arXiv Detail & Related papers (2021-06-03T12:59:16Z)
- A Distribution-Dependent Analysis of Meta-Learning [13.24264919706183]
A key problem in the theory of meta-learning is to understand how the task distributions influence transfer risk.
In this paper, we give distribution-dependent lower bounds on the transfer risk of any algorithm.
We show that a novel, weighted version of the so-called biased regularized regression method is able to match these lower bounds up to a fixed constant factor.
arXiv Detail & Related papers (2020-10-31T19:36:15Z)
- Model Fusion with Kullback--Leibler Divergence [58.20269014662046]
We propose a method to fuse posterior distributions learned from heterogeneous datasets.
Our algorithm relies on a mean field assumption for both the fused model and the individual dataset posteriors.
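In the fully Gaussian, mean-field case, fusing posteriors reduces to precision-weighted averaging of the individual densities; a toy sketch of that special case (the paper's KL-based algorithm handles the general setting):

```python
def fuse_gaussians(means, variances):
    # Product of Gaussian densities: precisions add, and the fused mean
    # is the precision-weighted average of the individual means.
    precisions = [1.0 / v for v in variances]
    total = sum(precisions)
    mean = sum(p * m for p, m in zip(precisions, means)) / total
    return mean, 1.0 / total

# Two equally confident posteriors: the fused mean is their average and
# the fused variance is halved.
fused_mean, fused_var = fuse_gaussians([1.0, 3.0], [1.0, 1.0])
```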
arXiv Detail & Related papers (2020-07-13T03:27:45Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.