Bayesian Optimization via Continual Variational Last Layer Training
- URL: http://arxiv.org/abs/2412.09477v1
- Date: Thu, 12 Dec 2024 17:21:50 GMT
- Title: Bayesian Optimization via Continual Variational Last Layer Training
- Authors: Paul Brunzema, Mikkel Jordahn, John Willes, Sebastian Trimpe, Jasper Snoek, James Harrison
- Abstract summary: We build on variational Bayesian last layers (VBLLs) to connect training of these models to exact conditioning in GPs.
We exploit this connection to develop an efficient online training algorithm that interleaves conditioning and optimization.
Our findings suggest that VBLL networks significantly outperform GPs and other BNN architectures on tasks with complex input correlations.
- Score: 16.095427911235646
- License:
- Abstract: Gaussian Processes (GPs) are widely seen as the state-of-the-art surrogate models for Bayesian optimization (BO) due to their ability to model uncertainty, their strong performance on tasks where correlations are easily captured (such as those defined by Euclidean metrics), and the ease with which they can be updated online. However, the performance of GPs depends on the choice of kernel, and kernel selection for complex correlation structures is often difficult or must be done bespoke. While Bayesian neural networks (BNNs) are a promising direction for higher-capacity surrogate models, they have so far seen limited use due to poor performance on some problem types. In this paper, we propose an approach which shows competitive performance on many problem types, including some that BNNs typically struggle with. We build on variational Bayesian last layers (VBLLs) and connect the training of these models to exact conditioning in GPs. We exploit this connection to develop an efficient online training algorithm that interleaves conditioning and optimization. Our findings suggest that VBLL networks significantly outperform GPs and other BNN architectures on tasks with complex input correlations, and match the performance of well-tuned GPs on established benchmark tasks.
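The connection the abstract refers to is that a Gaussian last layer over fixed features behaves like exact Bayesian linear regression, so new observations can be absorbed by closed-form conditioning rather than full retraining. Below is a minimal NumPy sketch of that conditioning step under the assumption of fixed features phi(x); the class name, parameters, and update form are illustrative only and are not the authors' implementation, which additionally trains the feature network and the last-layer posterior variationally.

```python
import numpy as np

class BayesianLastLayer:
    """Sketch: Gaussian last layer as Bayesian linear regression on features phi(x)."""

    def __init__(self, feature_dim, prior_var=1.0, noise_var=0.1):
        self.noise_var = noise_var
        # Natural parameters of the Gaussian posterior over last-layer weights:
        # precision matrix Lambda and precision-weighted mean eta.
        self.Lambda = np.eye(feature_dim) / prior_var
        self.eta = np.zeros(feature_dim)

    def condition(self, phi, y):
        # Closed-form rank-one update on a new observation (phi, y);
        # with fixed features this matches exact GP conditioning under a
        # linear kernel on phi.
        self.Lambda += np.outer(phi, phi) / self.noise_var
        self.eta += phi * y / self.noise_var

    def predict(self, phi):
        # Posterior predictive mean and variance at feature vector phi.
        Sigma = np.linalg.inv(self.Lambda)
        mean = phi @ Sigma @ self.eta
        var = phi @ Sigma @ phi + self.noise_var
        return mean, var
```

In a BO loop, such a model would fold each new evaluation in through `condition` and optimize an acquisition function against `predict`, with the feature network itself refit only periodically (variationally, in the paper's setting).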
Related papers
- Revisiting the Equivalence of Bayesian Neural Networks and Gaussian Processes: On the Importance of Learning Activations [1.0468715529145969]
We show that trainable activations are crucial for effective mapping of GP priors to wide BNNs.
We also introduce trainable periodic activations that ensure global stationarity by design.
arXiv Detail & Related papers (2024-10-21T08:42:10Z) - A Study of Bayesian Neural Network Surrogates for Bayesian Optimization [46.97686790714025]
Bayesian neural networks (BNNs) have recently become practical function approximators.
We study BNNs as alternatives to standard GP surrogates for optimization.
arXiv Detail & Related papers (2023-05-31T17:00:00Z) - Deep Negative Correlation Classification [82.45045814842595]
Existing deep ensemble methods naively train many different models and then aggregate their predictions.
We propose deep negative correlation classification (DNCC).
DNCC yields a deep classification ensemble where the individual estimator is both accurate and negatively correlated.
arXiv Detail & Related papers (2022-12-14T07:35:20Z) - GNN at the Edge: Cost-Efficient Graph Neural Network Processing over Distributed Edge Servers [24.109721494781592]
Graph Neural Networks (GNNs) are still under exploration, presenting a stark disparity to their broad edge adoption.
This paper studies the cost optimization for distributed GNN processing over a multi-tier heterogeneous edge network.
We show that our approach achieves superior performance over de facto baselines, with more than 95.8% cost reduction and fast convergence.
arXiv Detail & Related papers (2022-10-31T13:03:16Z) - Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Binary neural networks (BNNs) neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z) - Slimmable Domain Adaptation [112.19652651687402]
We introduce a simple framework, Slimmable Domain Adaptation, to improve cross-domain generalization with a weight-sharing model bank.
Our framework surpasses other competing approaches by a very large margin on multiple benchmarks.
arXiv Detail & Related papers (2022-06-14T06:28:04Z) - Deep Architecture Connectivity Matters for Its Convergence: A Fine-Grained Analysis [94.64007376939735]
We theoretically characterize the impact of connectivity patterns on the convergence of deep neural networks (DNNs) under gradient descent training.
We show that by a simple filtration on "unpromising" connectivity patterns, we can trim down the number of models to evaluate.
arXiv Detail & Related papers (2022-05-11T17:43:54Z) - Encoding the latent posterior of Bayesian Neural Networks for uncertainty quantification [10.727102755903616]
We aim for efficient deep BNNs amenable to complex computer vision architectures.
We achieve this by leveraging variational autoencoders (VAEs) to learn the interaction and the latent distribution of the parameters at each network layer.
Our approach, Latent-Posterior BNN (LP-BNN), is compatible with the recent BatchEnsemble method, leading to highly efficient (in terms of computation and memory during both training and testing) ensembles.
arXiv Detail & Related papers (2020-12-04T19:50:09Z) - Belief Propagation Neural Networks [103.97004780313105]
We introduce belief propagation neural networks (BPNNs).
BPNNs operate on factor graphs and generalize belief propagation (BP).
We show that BPNNs converge 1.7x faster on Ising models while providing tighter bounds.
On challenging model counting problems, BPNNs compute estimates hundreds of times faster than state-of-the-art handcrafted methods.
arXiv Detail & Related papers (2020-07-01T07:39:51Z)