Meta-learning Feature Representations for Adaptive Gaussian Processes
via Implicit Differentiation
- URL: http://arxiv.org/abs/2205.02708v1
- Date: Thu, 5 May 2022 15:26:53 GMT
- Title: Meta-learning Feature Representations for Adaptive Gaussian Processes
via Implicit Differentiation
- Authors: Wenlin Chen, Austin Tripp, José Miguel Hernández-Lobato
- Abstract summary: We propose a general framework for learning deep kernels by interpolating between meta-learning and conventional learning.
Although ADKF is a completely general method, we argue that it is especially well-suited for drug discovery problems.
- Score: 1.5293427903448025
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose Adaptive Deep Kernel Fitting (ADKF), a general framework for
learning deep kernels by interpolating between meta-learning and conventional
learning. Our approach employs a bilevel optimization objective where we
meta-learn feature representations that are generally useful across tasks, in
the sense that task-specific Gaussian process models estimated on top of such
features achieve the lowest possible predictive loss on average across tasks.
We solve the resulting nested optimization problem using the implicit function
theorem. We show that ADKF contains Deep Kernel Learning and Deep Kernel
Transfer as special cases. Although ADKF is a completely general method, we
argue that it is especially well-suited for drug discovery problems and
demonstrate that it significantly outperforms previous state-of-the-art methods
on a variety of real-world few-shot molecular property prediction tasks and
out-of-domain molecular optimization tasks.
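To make the bilevel structure concrete, here is a minimal, hypothetical sketch (ours, not the authors' implementation): an inner loop fits task-specific GP hyperparameters on a support set over meta-learned features, and the outer gradient for the feature extractor is assembled via the implicit function theorem. The names (`phi`, `gp_nll`) and toy data are illustrative; for brevity the outer loss below is a joint marginal likelihood over support and query data, whereas ADKF uses a predictive loss on the query set.

```python
# Hypothetical sketch of ADKF-style bilevel learning on one task.
import torch

torch.manual_seed(0)
phi = torch.nn.Sequential(torch.nn.Linear(4, 16), torch.nn.Tanh(),
                          torch.nn.Linear(16, 2))   # meta-learned features

def gp_nll(theta, z, y):
    """GP negative log marginal likelihood (up to constants);
    theta = (log lengthscale, log noise), z = features, y = targets."""
    ls, noise = theta[0].exp(), theta[1].exp()
    K = torch.exp(-0.5 * torch.cdist(z / ls, z / ls) ** 2) + noise * torch.eye(len(y))
    L = torch.linalg.cholesky(K)
    a = torch.cholesky_solve(y[:, None], L).squeeze(1)
    return 0.5 * y @ a + torch.log(torch.diag(L)).sum()

xs, ys = torch.randn(10, 4), torch.randn(10)         # toy support set
xq, yq = torch.randn(5, 4), torch.randn(5)           # toy query set

# Inner loop: fit task-specific GP hyperparameters on frozen features.
theta = torch.zeros(2, requires_grad=True)
inner_opt = torch.optim.Adam([theta], lr=0.05)
for _ in range(200):
    inner_opt.zero_grad()
    gp_nll(theta, phi(xs).detach(), ys).backward()
    inner_opt.step()

def outer_loss(th):
    z = torch.cat([phi(xs), phi(xq)])
    return gp_nll(th, z, torch.cat([ys, yq]))

# IFT: dL_out/dphi = direct term - (d^2 L_in / dphi dtheta) H^{-1} dL_out/dtheta,
# where H is the inner Hessian w.r.t. theta at the inner optimum.
L_in = gp_nll(theta, phi(xs), ys)
g_in = torch.autograd.grad(L_in, theta, create_graph=True)[0]
H = torch.stack([torch.autograd.grad(g_in[i], theta, retain_graph=True)[0]
                 for i in range(2)])
g_out = torch.autograd.grad(outer_loss(theta), theta)[0]
v = torch.linalg.solve(H, g_out).detach()            # H^{-1} dL_out/dtheta
mixed = torch.autograd.grad(g_in @ v, phi.parameters())
direct = torch.autograd.grad(outer_loss(theta.detach()), phi.parameters())
with torch.no_grad():                                # one outer step on phi
    for p, d, m in zip(phi.parameters(), direct, mixed):
        p -= 1e-2 * (d - m)
```

The point of the IFT route is that no gradients flow through the inner optimizer's trajectory: only the inner optimum, one Hessian solve, and a mixed second-derivative term are needed.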
Related papers
- Unlearning as multi-task optimization: A normalized gradient difference approach with an adaptive learning rate [105.86576388991713]
We introduce a normalized gradient difference (NGDiff) algorithm, enabling us to have better control over the trade-off between the objectives.
We provide a theoretical analysis and empirically demonstrate the superior performance of NGDiff among state-of-the-art unlearning methods on the TOFU and MUSE datasets.
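The summary above only names the update. As a rough, assumed reading of "normalized gradient difference" (not the published algorithm; see the paper for the actual method), one can normalize each objective's gradient before differencing, so that neither objective's raw gradient scale dominates the trade-off:

```python
# Assumed sketch of a normalized-gradient-difference style unlearning update.
import torch

def ngdiff_direction(g_forget, g_retain, eps=1e-12):
    """Difference of unit-normalized task gradients."""
    gf = g_forget / (g_forget.norm() + eps)
    gr = g_retain / (g_retain.norm() + eps)
    return gr - gf

params = torch.randn(5, requires_grad=True)
g_f = torch.autograd.grad((params ** 2).sum(), params)[0]        # forget loss
g_r = torch.autograd.grad((params - 1.0).pow(2).sum(), params)[0]  # retain loss
with torch.no_grad():
    # Descend the retain loss while ascending the forget loss.
    params -= 0.1 * ngdiff_direction(g_f, g_r)
```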
arXiv Detail & Related papers (2024-10-29T14:41:44Z)
- Global Optimization of Gaussian Process Acquisition Functions Using a Piecewise-Linear Kernel Approximation [2.3342885570554652]
We introduce a piecewise-linear approximation for Gaussian process kernels and a corresponding mixed-integer quadratic programming (MIQP) representation for acquisition functions.
We empirically demonstrate the framework on synthetic functions, constrained benchmarks, and hyperparameter tuning tasks.
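As a self-contained toy (assumptions ours: an RBF kernel approximated as a function of distance, with uniformly placed knots), the sketch below shows the kind of piecewise-linear surrogate that makes kernel terms expressible with linear constraints inside an MIQP:

```python
# Piecewise-linear approximation of an RBF kernel over distance (toy example).
import numpy as np

def pwl_kernel(r, knots, values):
    """Piecewise-linear interpolation of k(r) between precomputed knots."""
    return np.interp(r, knots, values)

knots = np.linspace(0.0, 4.0, 9)            # breakpoints in distance
values = np.exp(-0.5 * knots ** 2)          # exact RBF values at the knots

r = np.linspace(0.0, 4.0, 101)
err = np.max(np.abs(pwl_kernel(r, knots, values) - np.exp(-0.5 * r ** 2)))
print(f"max PWL approximation error with 8 segments: {err:.4f}")
```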
arXiv Detail & Related papers (2024-10-22T10:56:52Z)
- Physics Inspired Approaches To Understanding Gaussian Processes [0.9712140341805067]
We contribute an analysis of the loss landscape for GP models using methods from physics.
We demonstrate $\nu$-continuity for Matérn kernels and outline aspects of catastrophe theory at critical points in the loss landscape.
We also provide an a priori method for evaluating the effect of GP ensembles and discuss various voting approaches based on physical properties of the loss landscape.
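For intuition about what "loss landscape" means here, this toy (ours, not from the paper) evaluates the GP negative log marginal likelihood over a grid of lengthscale and noise hyperparameters; its critical points are the objects such an analysis studies:

```python
# GP negative log marginal likelihood surface over (lengthscale, noise).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(-3, 3, 25)
y = np.sin(x) + 0.1 * rng.standard_normal(25)

def nlml(ls, noise):
    """NLML (up to constants) of a GP with an RBF kernel on the toy data."""
    K = np.exp(-0.5 * (x[:, None] - x[None, :]) ** 2 / ls ** 2) + noise * np.eye(25)
    L = np.linalg.cholesky(K)
    a = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return 0.5 * y @ a + np.log(np.diag(L)).sum()

ls_grid, noise_grid = np.logspace(-1, 1, 30), np.logspace(-3, 0, 30)
Z = np.array([[nlml(l, n) for n in noise_grid] for l in ls_grid])
i, j = np.unravel_index(Z.argmin(), Z.shape)
print(f"NLML minimum at lengthscale={ls_grid[i]:.2f}, noise={noise_grid[j]:.3f}")
```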
arXiv Detail & Related papers (2023-05-18T06:39:07Z)
- A Generalized EigenGame with Extensions to Multiview Representation Learning [0.28647133890966997]
Generalized Eigenvalue Problems (GEPs) encompass a range of interesting dimensionality reduction methods.
We develop an approach to solving GEPs in which all constraints are softly enforced by Lagrange multipliers.
We show that our approaches share much of the theoretical grounding of the previous Hebbian and game theoretic approaches for the linear case.
We demonstrate the effectiveness of our method for solving GEPs in the setting of canonical multiview datasets.
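A minimal sketch of the underlying problem, under our own assumptions rather than the paper's algorithm: the generalized eigenvalue problem Av = λBv solved exactly as a reference, next to a gradient ascent on the quadratic form where the B-norm constraint is enforced softly by a fixed quadratic penalty (a stand-in for a learned Lagrange multiplier):

```python
# GEP: exact solve vs. soft-constrained gradient ascent (toy, 5-dimensional).
import numpy as np
from scipy.linalg import eigh

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5)); A = M + M.T                  # symmetric
N = rng.standard_normal((5, 5)); B = N @ N.T + 5 * np.eye(5)  # SPD

w, _ = eigh(A, B)                      # exact generalized eigenvalues (ascending)

v, rho, lr = 0.1 * rng.standard_normal(5), 10.0, 5e-3
for _ in range(10000):
    # Ascend 0.5 v^T A v - (rho/4)(v^T B v - 1)^2: constraint held by penalty.
    v += lr * (A @ v - rho * (v @ B @ v - 1.0) * (B @ v))
print("penalized ascent Rayleigh quotient:", v @ A @ v / (v @ B @ v))
print("largest exact eigenvalue:          ", w[-1])
```

The ascent settles near a generalized eigenpair; with a larger penalty weight the constraint violation shrinks roughly in proportion to the eigenvalue over rho.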
arXiv Detail & Related papers (2022-11-21T10:11:13Z)
- Scalable PAC-Bayesian Meta-Learning via the PAC-Optimal Hyper-Posterior: From Theory to Practice [54.03076395748459]
A central question in the meta-learning literature is how to regularize to ensure generalization to unseen tasks.
We present a generalization bound for meta-learning, which was first derived by Rothfuss et al.
We provide a theoretical analysis and an empirical case study of the conditions under which, and the extent to which, these meta-learning guarantees improve upon PAC-Bayesian per-task learning bounds.
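Schematically, bounds of this family have the following shape; constants, exact rates, and conditions are omitted, and this is not the paper's precise statement:

```latex
% Schematic only: n tasks with m samples each, hyper-prior P, hyper-posterior Q.
\mathcal{L}(\mathcal{Q}) \;\le\;
\widehat{\mathcal{L}}(\mathcal{Q}, S_{1:n})
\;+\; \mathcal{O}\!\left(\tfrac{1}{\sqrt{n}}\right)
      \mathrm{KL}(\mathcal{Q}\,\|\,\mathcal{P})
\;+\; \mathcal{O}\!\left(\tfrac{1}{n\sqrt{m}}\right)
      \sum_{i=1}^{n} \text{(per-task complexity)}
```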
arXiv Detail & Related papers (2022-11-14T08:51:04Z)
- MARS: Meta-Learning as Score Matching in the Function Space [79.73213540203389]
We present a novel approach to extracting inductive biases from a set of related datasets.
We use functional Bayesian neural network inference, which views the prior as a stochastic process and performs inference in the function space.
Our approach can seamlessly acquire and represent complex prior knowledge by meta-learning the score function of the data-generating process.
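As a small illustration of the score-matching ingredient, here is a standard denoising score-matching estimator; the network, noise scale, and stand-in data are our assumptions, and the paper's functional/meta-learning machinery is not reproduced:

```python
# Denoising score matching: learn the score of Gaussian-perturbed data.
import torch

score_net = torch.nn.Sequential(torch.nn.Linear(2, 64), torch.nn.SiLU(),
                                torch.nn.Linear(64, 2))
opt = torch.optim.Adam(score_net.parameters(), lr=1e-3)
sigma = 0.1

for step in range(500):
    x = torch.randn(128, 2)                 # stand-in for samples of the process
    noise = torch.randn_like(x)
    x_noisy = x + sigma * noise
    # Score of the Gaussian perturbation kernel at x_noisy is -noise / sigma.
    loss = ((score_net(x_noisy) + noise / sigma) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
```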
arXiv Detail & Related papers (2022-10-24T15:14:26Z)
- Multi-Task Learning on Networks [0.0]
Multi-objective optimization problems arising in the multi-task learning context have specific features and require adhoc methods.
In this thesis the solutions in the Input Space are represented as probability distributions encapsulating the knowledge contained in the function evaluations.
In this space of probability distributions, endowed with the Wasserstein distance as its metric, a new algorithm, MOEA/WST, can be designed in which the model is not fit directly on the objective function.
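A tiny example of the distance underlying this design (not the MOEA/WST algorithm itself): comparing two candidate solutions, each represented as an empirical distribution of function evaluations, via the 1-D Wasserstein distance:

```python
# Wasserstein distance between two empirical distributions of evaluations.
import numpy as np
from scipy.stats import wasserstein_distance

rng = np.random.default_rng(0)
evals_a = rng.normal(0.0, 1.0, 200)   # evaluations for candidate solution A
evals_b = rng.normal(0.5, 1.2, 200)   # evaluations for candidate solution B
print("W1(A, B) =", wasserstein_distance(evals_a, evals_b))
```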
arXiv Detail & Related papers (2021-12-07T09:13:10Z)
- Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems.
Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for mean-field control (MFC).
We provide a practical parametrization of the core optimization problem.
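For context, the standard discrete-time mean-field control objective has a single representative agent interacting with the population's state distribution; this is the textbook formulation, not the paper's specific model or parametrization:

```latex
% Discrete-time mean-field control: \mu_t is the population state distribution.
\max_{\pi} \;\; \mathbb{E}\!\left[\sum_{t=0}^{T} r(s_t, a_t, \mu_t)\right],
\qquad a_t \sim \pi(\cdot \mid s_t, \mu_t), \quad
s_{t+1} \sim p(\cdot \mid s_t, a_t, \mu_t), \quad \mu_t = \mathrm{law}(s_t).
```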
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
- Meta-Learning with Neural Tangent Kernels [58.06951624702086]
We propose the first meta-learning paradigm in the Reproducing Kernel Hilbert Space (RKHS) induced by the meta-model's Neural Tangent Kernel (NTK).
Within this paradigm, we introduce two meta-learning algorithms, which no longer need a sub-optimal iterative inner-loop adaptation as in the MAML framework.
We achieve this goal by 1) replacing the adaptation with a fast-adaptive regularizer in the RKHS; and 2) solving the adaptation analytically based on the NTK theory.
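To illustrate what an analytic adaptation step looks like, here is kernel ridge regression on a support set, with a plain RBF kernel standing in for the meta-model's NTK; this is a sketch under our own assumptions, not the paper's algorithms:

```python
# Closed-form "adaptation": kernel ridge regression replaces the inner loop.
import numpy as np

def rbf(a, b, ls=1.0):
    """RBF kernel matrix between 1-D inputs a and b (NTK stand-in)."""
    return np.exp(-0.5 * ((a[:, None] - b[None, :]) / ls) ** 2)

rng = np.random.default_rng(0)
xs = rng.uniform(-3, 3, 10); ys = np.sin(xs)   # support set of one task
xq = np.linspace(-3, 3, 5)                     # query points

lam = 1e-3                                     # ridge regularizer
alpha = np.linalg.solve(rbf(xs, xs) + lam * np.eye(10), ys)
preds = rbf(xq, xs) @ alpha                    # analytic adaptation, no loop
print(np.round(preds - np.sin(xq), 3))         # small residuals on this toy
```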
arXiv Detail & Related papers (2021-02-07T20:53:23Z)
- On the Global Optimality of Model-Agnostic Meta-Learning [133.16370011229776]
Model-agnostic meta-learning (MAML) formulates meta-learning as a bilevel optimization problem, where the inner level solves each subtask based on a shared prior.
We characterize the optimality gap of the stationary points attained by MAML for both reinforcement learning and supervised learning, where the inner-level and outer-level problems are solved via first-order optimization methods.
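The bilevel formulation reads as follows, with a one-step first-order inner solver shown for concreteness (α is the inner-loop step size, and the per-task losses are schematic):

```latex
% MAML as bilevel optimization with a one-step inner update.
\min_{\theta} \; \sum_{i=1}^{N} \mathcal{L}_i^{\mathrm{outer}}\!\big(\theta_i'\big)
\qquad \text{s.t.} \qquad
\theta_i' \;=\; \theta \;-\; \alpha \,\nabla_{\theta}\, \mathcal{L}_i^{\mathrm{inner}}(\theta).
```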
arXiv Detail & Related papers (2020-06-23T17:33:14Z)