Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice
- URL: http://arxiv.org/abs/2211.07206v3
- Date: Fri, 22 Dec 2023 16:30:43 GMT
- Title: Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior:
From Theory to Practice
- Authors: Jonas Rothfuss, Martin Josifoski, Vincent Fortuin, Andreas Krause
- Abstract summary: A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al. (2021a).
We provide a theoretical analysis and empirical case study under which conditions and to what extent these guarantees for meta-learning improve upon PAC-Bayesian per-task learning bounds.
- Score: 54.03076395748459
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Meta-Learning aims to speed up the learning process on new tasks by acquiring
useful inductive biases from datasets of related learning tasks. While, in
practice, the number of related tasks available is often small, most of the
existing approaches assume an abundance of tasks, making them unrealistic and
prone to overfitting. A central question in the meta-learning literature is how
to regularize to ensure generalization to unseen tasks. In this work, we
provide a theoretical analysis using the PAC-Bayesian theory and present a
generalization bound for meta-learning, which was first derived by Rothfuss et
al. (2021a). Crucially, the bound allows us to derive the closed form of the
optimal hyper-posterior, referred to as PACOH, which leads to the best
performance guarantees. We provide a theoretical analysis and empirical case
study under which conditions and to what extent these guarantees for
meta-learning improve upon PAC-Bayesian per-task learning bounds. The
closed-form PACOH inspires a practical meta-learning approach that avoids the
reliance on bi-level optimization, giving rise to a stochastic optimization
problem that is amenable to standard variational methods that scale well. Our
experiments show that, when instantiating the PACOH with Gaussian processes and
Bayesian Neural Network models, the resulting methods are more scalable and
yield state-of-the-art performance, both in terms of predictive accuracy and
the quality of uncertainty estimates.
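For orientation, the PAC-optimal hyper-posterior referred to above is a Gibbs
distribution over priors. A sketch of its structure, following Rothfuss et al.
(2021a) but with the temperature and scaling constants simplified into a single
factor tau (the paper gives their exact values):

    Q^*(P) \propto \Pi(P) \exp\Big( \tau \sum_{i=1}^{n} \ln Z_\beta(S_i, P) \Big),
    \qquad Z_\beta(S_i, P) = \int P(\theta)\, e^{-\beta \hat{L}(\theta, S_i)}\, d\theta,

where \Pi is the hyper-prior over candidate priors P, S_1, ..., S_n are the task
datasets, and \hat{L}(\theta, S_i) is the empirical loss of hypothesis \theta on
task i. Since \ln Z_\beta plays the role of a (generalized) log marginal
likelihood, meta-learning reduces to a single-level problem: score each
candidate prior by how well it marginally explains every task, regularized by
the hyper-prior.
Below is a minimal, hypothetical PyTorch sketch of that single-level objective
for Bayesian linear-regression tasks, with the hyper-posterior collapsed to a
point estimate over the parameters (mu_P, sigma_P) of a Gaussian prior. All
names, the synthetic data, and the unit-Gaussian hyper-prior are illustrative
assumptions; the paper's PACOH-GP and PACOH-NN methods instead use GP and BNN
marginal-likelihood estimates and variational approximations of the full
hyper-posterior.

    import torch

    # Synthetic meta-dataset: a handful of related linear-regression tasks (illustrative).
    torch.manual_seed(0)
    n_tasks, m, d = 8, 10, 3
    shared_w = torch.randn(d)
    tasks = []
    for _ in range(n_tasks):
        X = torch.randn(m, d)
        w_i = shared_w + 0.1 * torch.randn(d)   # task weights scattered around a shared mean
        y = X @ w_i + 0.05 * torch.randn(m)
        tasks.append((X, y))

    # Learnable Gaussian prior N(mu_P, diag(sigma_P^2)) over regression weights
    # (the hyper-posterior collapsed to a point estimate for simplicity).
    mu_P = torch.zeros(d, requires_grad=True)
    log_sigma_P = torch.zeros(d, requires_grad=True)
    sigma_n = 0.1                               # assumed observation-noise level
    opt = torch.optim.Adam([mu_P, log_sigma_P], lr=0.05)

    def log_marginal_likelihood(X, y):
        # With a Gaussian prior and Gaussian noise, y is marginally Gaussian:
        # y ~ N(X mu_P, X diag(sigma_P^2) X^T + sigma_n^2 I).
        S = torch.diag(log_sigma_P.exp() ** 2)
        cov = X @ S @ X.T + sigma_n ** 2 * torch.eye(X.shape[0])
        return torch.distributions.MultivariateNormal(
            X @ mu_P, covariance_matrix=cov).log_prob(y)

    for step in range(300):
        opt.zero_grad()
        # Single-level objective: sum of per-task log marginal likelihoods
        # plus a Gaussian hyper-prior on the prior mean as meta-level regularizer.
        obj = sum(log_marginal_likelihood(X, y) for X, y in tasks)
        obj = obj + torch.distributions.Normal(0.0, 1.0).log_prob(mu_P).sum()
        (-obj).backward()
        opt.step()

    print("meta-learned prior mean:", mu_P.detach())

The point of the sketch is only the shape of the objective: a sum of per-task
log marginal likelihoods plus a meta-level regularizer, optimized directly with
stochastic gradients rather than through a bi-level inner/outer loop.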
Related papers
- Learning-to-Optimize with PAC-Bayesian Guarantees: Theoretical Considerations and Practical Implementation [4.239829789304117]
We apply PAC-Bayesian theory to the learning-to-optimize setting.
We present the first framework to learn optimization algorithms with provable generalization guarantees.
Our learned algorithms provably outperform related ones derived from a (deterministic) worst-case analysis.
arXiv Detail & Related papers (2024-04-04T08:24:57Z) - MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
arXiv Detail & Related papers (2022-10-24T15:14:26Z) - Meta-learning Feature Representations for Adaptive Gaussian Processes
via Implicit Differentiation [1.5293427903448025]
We propose a general framework for learning deep kernels by interpolating between meta-learning and conventional learning.
Although the proposed ADKF framework is completely general, we argue that it is especially well-suited for drug discovery problems.
arXiv Detail & Related papers (2022-05-05T15:26:53Z) - Learning MDPs from Features: Predict-Then-Optimize for Sequential
Decision Problems by Reinforcement Learning [52.74071439183113]
We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) solved via reinforcement learning.
Two significant computational challenges arise in applying decision-focused learning to MDPs.
arXiv Detail & Related papers (2021-06-06T23:53:31Z) - Bridging the Gap Between Practice and PAC-Bayes Theory in Few-Shot
Meta-Learning [20.911545126223405]
We develop two PAC-Bayesian bounds tailored for the few-shot learning setting.
We show that two existing meta-learning algorithms (MAML and Reptile) can be derived from our bounds.
We derive a new computationally-efficient PACMAML algorithm, and show it outperforms existing meta-learning algorithms on several few-shot benchmark datasets.
arXiv Detail & Related papers (2021-05-28T20:40:40Z) - Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
arXiv Detail & Related papers (2021-02-07T20:53:23Z) - PAC-Bayes Bounds for Meta-learning with Data-Dependent Prior [36.38937352131301]
We derive three novel generalisation error bounds for meta-learning based on the PAC-Bayes relative entropy bound.
Experiments illustrate that the proposed three PAC-Bayes bounds for meta-learning yield competitive generalization performance guarantees.
arXiv Detail & Related papers (2021-02-07T09:03:43Z) - On the Global Optimality of Model-Agnostic Meta-Learning [133.16370011229776]
Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior.
We characterize the optimality of the stationary points attained by MAML for both reinforcement learning and supervised learning, where the inner-level and outer-level problems are solved via first-order optimization methods. A minimal sketch of this bilevel inner/outer structure is given after the list below.
arXiv Detail & Related papers (2020-06-23T17:33:14Z) - PACOH: Bayes-Optimal Meta-Learning with PAC-Guarantees [77.67258935234403]
We provide a theoretical analysis using the PAC-Bayesian framework and derive novel generalization bounds for meta-learning.
We develop a class of PAC-optimal meta-learning algorithms with performance guarantees and a principled meta-level regularization.
arXiv Detail & Related papers (2020-02-13T15:01:38Z)