Towards Scalable Bayesian Optimization via Gradient-Informed Bayesian Neural Networks
- URL: http://arxiv.org/abs/2504.10076v1
- Date: Mon, 14 Apr 2025 10:21:08 GMT
- Title: Towards Scalable Bayesian Optimization via Gradient-Informed Bayesian Neural Networks
- Authors: Georgios Makrygiorgos, Joshua Hang Sai Ip, Ali Mesbah
- Abstract summary: We propose a gradient-informed loss function for BNN training, effectively augmenting function observations with local gradient information. The effectiveness of this approach is demonstrated on well-known benchmarks in terms of improved BNN predictions and faster BO convergence.
- Score: 6.165163123577486
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization (BO) is a widely used method for data-driven optimization that generally relies on zeroth-order data of the objective function to construct probabilistic surrogate models. These surrogates guide the exploration-exploitation process toward finding the global optimum. While Gaussian processes (GPs) are commonly employed as surrogates of the unknown objective function, recent studies have highlighted the potential of Bayesian neural networks (BNNs) as scalable and flexible alternatives. Moreover, incorporating gradient observations into GPs, when available, has been shown to improve BO performance. However, the use of gradients within BNN surrogates remains unexplored. By leveraging automatic differentiation, gradient information can be seamlessly integrated into BNN training, resulting in more informative surrogates for BO. We propose a gradient-informed loss function for BNN training, effectively augmenting function observations with local gradient information. The effectiveness of this approach is demonstrated on well-known benchmarks in terms of improved BNN predictions and faster BO convergence as the number of decision variables increases.
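The core idea in the abstract is to augment the usual fit-to-observations objective with a penalty on the mismatch between the surrogate's input gradients (computed by automatic differentiation) and the observed gradients. Below is a minimal PyTorch sketch of such a gradient-informed loss; it is not the authors' implementation, and the network `model`, the weight `lam`, and the squared-error form of both terms are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

# Minimal sketch (not the authors' code): a gradient-informed training loss
# for a BNN-style surrogate. Gradient observations at the training inputs are
# assumed available; the surrogate's own input gradients come from autodiff.

def gradient_informed_loss(model, x, y_obs, grad_obs, lam=1.0):
    """Fit-to-function term plus a fit-to-gradient term.

    model:    surrogate network mapping R^d -> R (hypothetical placeholder)
    x:        (n, d) tensor of observed inputs
    y_obs:    (n, 1) tensor of function observations
    grad_obs: (n, d) tensor of gradient observations at x
    lam:      weight on the gradient term (illustrative hyperparameter)
    """
    x = x.detach().clone().requires_grad_(True)
    y_pred = model(x)                                     # (n, 1)
    # Input gradients of the surrogate, obtained by automatic differentiation.
    grad_pred = torch.autograd.grad(y_pred.sum(), x, create_graph=True)[0]
    return F.mse_loss(y_pred, y_obs) + lam * F.mse_loss(grad_pred, grad_obs)
```

In a BNN, `model` would stand in for one posterior sample of the network, and a term of this form would enter the training objective alongside the usual likelihood or regularization terms; the single weight `lam` and the mean-squared-error losses are simplifications for illustration.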
Related papers
- BiDense: Binarization for Dense Prediction [62.70804353158387]
BiDense is a generalized binary neural network (BNN) designed for efficient and accurate dense prediction tasks.
BiDense incorporates two key techniques: the Distribution-adaptive Binarizer (DAB) and the Channel-adaptive Full-precision Bypass (CFB).
arXiv Detail & Related papers (2024-11-15T16:46:04Z)
- PG-LBO: Enhancing High-Dimensional Bayesian Optimization with Pseudo-Label and Gaussian Process Guidance [31.585328335396607]
Current mainstream methods overlook the potential of utilizing a pool of unlabeled data to construct the latent space.
We propose a novel method to effectively utilize unlabeled data with the guidance of labeled data.
Our proposed method outperforms existing VAE-BO algorithms in various optimization scenarios.
arXiv Detail & Related papers (2023-12-28T11:57:58Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective at solving forward and inverse differential equation problems.
PINNs can become trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
Existing binary neural networks (BNNs) neglect the intrinsic bilinear relationship between real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- Invariance Learning in Deep Neural Networks with Differentiable Laplace Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z)
- End-to-End Learning of Deep Kernel Acquisition Functions for Bayesian Optimization [39.56814839510978]
We propose a meta-learning method for Bayesian optimization with neural network-based kernels.
Our model is trained with a reinforcement learning framework on multiple tasks.
In experiments using three text document datasets, we demonstrate that the proposed method achieves better BO performance than the existing methods.
arXiv Detail & Related papers (2021-11-01T00:42:31Z)
- Scalable and Flexible Deep Bayesian Optimization with Auxiliary Information for Scientific Problems [10.638330155988145]
We propose performing Bayesian optimization on complex, structured problems by using Bayesian Neural Networks (BNNs).
BNNs have the representation power and flexibility to handle structured data and exploit auxiliary information.
We show that BNNs often outperform GPs as surrogate models for BO in terms of both sampling efficiency and computational cost.
arXiv Detail & Related papers (2021-04-23T15:46:37Z)
- Gradient-Free Adversarial Attacks for Bayesian Neural Networks [9.797319790710713]
Adversarial examples underscore the importance of understanding the robustness of machine learning models.
In this work, we employ gradient-free optimization methods in order to find adversarial examples for BNNs.
arXiv Detail & Related papers (2020-12-23T13:19:11Z)
- Improving predictions of Bayesian neural nets via local linearization [79.21517734364093]
We argue that the Gauss-Newton approximation should be understood as a local linearization of the underlying Bayesian neural network (BNN).
Because we use this linearized model for posterior inference, we should also predict using this modified model instead of the original one.
We refer to this modified predictive as "GLM predictive" and show that it effectively resolves common underfitting problems of the Laplace approximation.
arXiv Detail & Related papers (2020-08-19T12:35:55Z)
- Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
arXiv Detail & Related papers (2020-06-07T07:06:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.