Transfer Learning with Gaussian Processes for Bayesian Optimization
- URL: http://arxiv.org/abs/2111.11223v1
- Date: Mon, 22 Nov 2021 14:09:45 GMT
- Title: Transfer Learning with Gaussian Processes for Bayesian Optimization
- Authors: Petru Tighineanu, Kathrin Skubch, Paul Baireuther, Attila Reiss, Felix
Berkenkamp, Julia Vinogradska
- Abstract summary: We provide a unified view on hierarchical GP models for transfer learning, which allows us to analyze the relationship between methods.
We develop a novel closed-form boosted GP transfer model that fits between existing approaches in terms of complexity.
We evaluate the performance of the different approaches in large-scale experiments and highlight strengths and weaknesses of the different transfer-learning methods.
- Score: 9.933956770453438
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian optimization is a powerful paradigm to optimize black-box functions
based on scarce and noisy data. Its data efficiency can be further improved by
transfer learning from related tasks. While recent transfer models meta-learn a
prior based on large amounts of data, in the low-data regime methods that
exploit the closed-form posterior of Gaussian processes (GPs) have an
advantage. In this setting, several analytically tractable transfer-model
posteriors have been proposed, but the relative advantages of these methods are
not well understood. In this paper, we provide a unified view on hierarchical
GP models for transfer learning, which allows us to analyze the relationship
between methods. As part of the analysis, we develop a novel closed-form
boosted GP transfer model that fits between existing approaches in terms of
complexity. We evaluate the performance of the different approaches in
large-scale experiments and highlight strengths and weaknesses of the different
transfer-learning methods.
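To make the setting concrete, below is a minimal illustrative sketch, not the paper's model or code: it approximates the "boosted GP" transfer idea by fitting one GP to plentiful source-task data and a second GP to the residuals of the scarce target-task data, then uses the combined posterior in a single upper-confidence-bound Bayesian-optimization step. It assumes scikit-learn; all data, names, and hyperparameters are hypothetical.
```python
# Sketch of a residual ("boosted") GP transfer surrogate for Bayesian optimization.
# Assumptions: scikit-learn GPs, independent source/residual variances, toy 1-D data.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

def fit_boosted_transfer_gp(X_src, y_src, X_tgt, y_tgt):
    """Fit a GP on source data, then a GP on the target residuals."""
    gp_src = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp_src.fit(X_src, y_src)
    residuals = y_tgt - gp_src.predict(X_tgt)  # target data minus source prediction
    gp_res = GaussianProcessRegressor(kernel=Matern(nu=2.5), alpha=1e-6)
    gp_res.fit(X_tgt, residuals)
    return gp_src, gp_res

def predict(gp_src, gp_res, X):
    """Combine source and residual GPs; variances added under an independence assumption."""
    mu_s, std_s = gp_src.predict(X, return_std=True)
    mu_r, std_r = gp_res.predict(X, return_std=True)
    return mu_s + mu_r, np.sqrt(std_s**2 + std_r**2)

# Toy usage: many source observations, few target observations.
rng = np.random.default_rng(0)
X_src = rng.uniform(0, 1, (50, 1))
y_src = np.sin(6 * X_src).ravel()
X_tgt = rng.uniform(0, 1, (5, 1))
y_tgt = np.sin(6 * X_tgt).ravel() + 0.3 * X_tgt.ravel()  # related but shifted target task

gp_src, gp_res = fit_boosted_transfer_gp(X_src, y_src, X_tgt, y_tgt)

# One Bayesian-optimization step: pick the next query point by maximizing
# an upper-confidence-bound acquisition over a candidate grid.
X_cand = np.linspace(0, 1, 200).reshape(-1, 1)
mu, std = predict(gp_src, gp_res, X_cand)
x_next = X_cand[np.argmax(mu + 2.0 * std)]
```
In a full optimization loop one would evaluate the target function at `x_next`, refit the residual GP on the enlarged target data set, and repeat; the paper's analysis concerns how such hierarchical combinations of source and target posteriors relate and trade off in complexity.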
Related papers
- A Bayesian Approach to Data Point Selection [24.98069363998565]
Data point selection (DPS) is becoming a critical topic in deep learning.
Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation.
We propose a novel Bayesian approach to DPS.
arXiv Detail & Related papers (2024-11-06T09:04:13Z)
- Heterogeneous Multi-Task Gaussian Cox Processes [61.67344039414193]
We present a novel extension of multi-task Gaussian Cox processes for modeling heterogeneous correlated tasks jointly.
A MOGP prior over the parameters of the dedicated likelihoods for classification, regression and point process tasks can facilitate sharing of information between heterogeneous tasks.
We derive a mean-field approximation to realize closed-form iterative updates for estimating model parameters.
arXiv Detail & Related papers (2023-08-29T15:01:01Z)
- Learning Large-scale Neural Fields via Context Pruned Meta-Learning [60.93679437452872]
We introduce an efficient optimization-based meta-learning technique for large-scale neural field training.
We show how gradient re-scaling at meta-test time allows the learning of extremely high-quality neural fields.
Our framework is model-agnostic, intuitive, straightforward to implement, and shows significant reconstruction improvements for a wide range of signals.
arXiv Detail & Related papers (2023-02-01T17:32:16Z)
- MACE: An Efficient Model-Agnostic Framework for Counterfactual Explanation [132.77005365032468]
We propose a novel framework of Model-Agnostic Counterfactual Explanation (MACE).
In our MACE approach, we propose a novel RL-based method for finding good counterfactual examples and a gradient-less descent method for improving proximity.
Experiments on public datasets validate the effectiveness with better validity, sparsity and proximity.
arXiv Detail & Related papers (2022-05-31T04:57:06Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- Merging Models with Fisher-Weighted Averaging [24.698591753644077]
We introduce a fundamentally different method for transferring knowledge across models that amounts to "merging" multiple models into one.
Our approach effectively involves computing a weighted average of the models' parameters.
We show that our merging procedure makes it possible to combine models in previously unexplored ways.
arXiv Detail & Related papers (2021-11-18T17:59:35Z)
- Modular Gaussian Processes for Transfer Learning [0.0]
We present a framework for transfer learning based on modular variational Gaussian processes (GPs).
We develop a module-based method that builds ensemble GP models without revisiting any data.
Our method avoids undesired data centralisation, reduces rising computational costs and allows the transfer of learned uncertainty metrics after training.
arXiv Detail & Related papers (2021-10-26T09:15:18Z)
- Harnessing Heterogeneity: Learning from Decomposed Feedback in Bayesian Modeling [68.69431580852535]
We introduce a novel GP regression model that incorporates the subgroup feedback.
Our modified regression has provably lower variance -- and thus a more accurate posterior -- compared to previous approaches.
We execute our algorithm on two disparate social problems.
arXiv Detail & Related papers (2021-07-07T03:57:22Z)
- Last Layer Marginal Likelihood for Invariance Learning [12.00078928875924]
We introduce a new lower bound to the marginal likelihood, which allows us to perform inference for a larger class of likelihood functions.
We work towards bringing this approach to neural networks by using an architecture with a Gaussian process in the last layer.
arXiv Detail & Related papers (2021-06-14T15:40:51Z)
- Extrapolation for Large-batch Training in Deep Learning [72.61259487233214]
We show that a host of variations can be covered in a unified framework that we propose.
We prove the convergence of this novel scheme and rigorously evaluate its empirical performance on ResNet, LSTM, and Transformer.
arXiv Detail & Related papers (2020-06-10T08:22:41Z)
- Sparse Gaussian Processes Revisited: Bayesian Approaches to Inducing-Variable Approximations [27.43948386608]
Variational inference techniques based on inducing variables provide an elegant framework for scalable estimation in Gaussian process (GP) models.
In this work we challenge the common wisdom that optimizing the inducing inputs in the variational framework yields optimal performance.
arXiv Detail & Related papers (2020-03-06T08:53:18Z)