Efficient Bayesian Optimization with Deep Kernel Learning and
Transformer Pre-trained on Multiple Heterogeneous Datasets
- URL: http://arxiv.org/abs/2308.04660v1
- Date: Wed, 9 Aug 2023 01:56:10 GMT
- Title: Efficient Bayesian Optimization with Deep Kernel Learning and
Transformer Pre-trained on Multiple Heterogeneous Datasets
- Authors: Wenlong Lyu, Shoubo Hu, Jie Chuai, Zhitang Chen
- Abstract summary: We propose a simple approach to pre-train a surrogate, which is a Gaussian process (GP) with a kernel defined on deep features learned from a Transformer-based encoder.
Experiments on both synthetic and real benchmark problems demonstrate the effectiveness of our proposed pre-training and transfer BO strategy.
- Score: 9.510327380529892
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization (BO) is widely adopted in black-box optimization
problems and it relies on a surrogate model to approximate the black-box
response function. With the increasing number of black-box optimization tasks
solved and even more to solve, the ability to learn from multiple prior tasks
to jointly pre-train a surrogate model is long-awaited to further boost
optimization efficiency. In this paper, we propose a simple approach to
pre-train a surrogate, which is a Gaussian process (GP) with a kernel defined
on deep features learned from a Transformer-based encoder, using datasets from
prior tasks with possibly heterogeneous input spaces. In addition, we provide a
simple yet effective mix-up initialization strategy for input tokens
corresponding to unseen input variables, thereby accelerating convergence on
new tasks. Experiments on both synthetic and real benchmark problems
demonstrate the effectiveness of our proposed pre-training and transfer BO
strategy over existing methods.
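The surrogate described in the abstract can be sketched concretely. The snippet below is a minimal, hypothetical illustration of a deep-kernel GP: a fixed random feature map stands in for the pre-trained Transformer encoder, an RBF kernel is evaluated on those deep features, and a `mixup_init` helper mimics the mix-up initialization of token embeddings for unseen input variables. All names, shapes, and values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical "encoder": a fixed random tanh feature map standing in for
# the pre-trained Transformer encoder of the paper.
W = rng.normal(size=(8, 2))
b = rng.normal(size=8)

def encode(X):
    """Map raw inputs to deep features (placeholder for the encoder)."""
    return np.tanh(X @ W.T + b)

def rbf(A, B, ls=1.0):
    """RBF kernel evaluated on deep features (deep kernel learning)."""
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-0.5 * d2 / ls**2)

def gp_posterior(X, y, Xs, noise=1e-3):
    """Standard GP posterior mean/variance under the deep kernel."""
    F, Fs = encode(X), encode(Xs)
    K = rbf(F, F) + noise * np.eye(len(X))
    Ks, Kss = rbf(Fs, F), rbf(Fs, Fs)
    Kinv = np.linalg.inv(K)
    mu = Ks @ Kinv @ y
    var = np.diag(Kss - Ks @ Kinv @ Ks.T)
    return mu, var

def mixup_init(emb_table, weights):
    """Mix-up initialization (sketch): embed an unseen input variable as a
    convex combination of existing variables' token embeddings."""
    w = np.asarray(weights, dtype=float)
    return (w / w.sum()) @ emb_table

X = rng.uniform(-1, 1, size=(10, 2))
y = np.sin(X[:, 0]) + np.cos(X[:, 1])
Xs = rng.uniform(-1, 1, size=(5, 2))
mu, var = gp_posterior(X, y, Xs)
```

The posterior mean and variance then drive an acquisition function as in standard BO; only the kernel's input representation changes.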
Related papers
- Faster WIND: Accelerating Iterative Best-of-$N$ Distillation for LLM Alignment [81.84950252537618]
This paper reveals a unified game-theoretic connection between iterative BOND and self-play alignment.
We establish a novel framework, WIN rate Dominance (WIND), with a series of efficient algorithms for regularized win rate dominance optimization.
arXiv Detail & Related papers (2024-10-28T04:47:39Z) - Bayesian Inverse Transfer in Evolutionary Multiobjective Optimization [29.580786235313987]
We introduce the first inverse transfer multiobjective evolutionary optimizer (invTrEMO).
InvTrEMO harnesses the common objective functions in many prevalent areas, even when decision spaces do not precisely align between tasks.
InvTrEMO yields high-precision inverse models as a significant byproduct, enabling the generation of tailored solutions on-demand.
arXiv Detail & Related papers (2023-12-22T14:12:18Z) - End-to-End Meta-Bayesian Optimisation with Transformer Neural Processes [52.818579746354665]
This paper proposes the first end-to-end differentiable meta-BO framework that generalises neural processes to learn acquisition functions via transformer architectures.
We enable this end-to-end framework with reinforcement learning (RL) to tackle the lack of labelled acquisition data.
arXiv Detail & Related papers (2023-05-25T10:58:46Z) - Transfer Learning for Bayesian Optimization: A Survey [29.229660973338145]
Black-box optimization is a powerful tool for modeling and optimizing expensive "black-box" functions.
Researchers in the BO community have proposed incorporating the spirit of transfer learning to accelerate the optimization process.
arXiv Detail & Related papers (2023-02-12T14:37:25Z) - A Data-Driven Evolutionary Transfer Optimization for Expensive Problems
in Dynamic Environments [9.098403098464704]
Data-driven, a.k.a. surrogate-assisted, evolutionary optimization has been recognized as an effective approach for tackling expensive black-box optimization problems.
This paper proposes a simple but effective transfer learning framework to empower data-driven evolutionary optimization to solve dynamic optimization problems.
Experiments on synthetic benchmark test problems and a real-world case study demonstrate the effectiveness of our proposed algorithm.
arXiv Detail & Related papers (2022-11-05T11:19:50Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on
AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD)
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
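ZO-signGD, named in the summary above, can be illustrated with a short sketch: estimate the gradient from function evaluations alone via two-point finite differences along random directions, then step along the sign of the averaged estimate. The function below is a hypothetical toy implementation under these assumptions, not the paper's code.

```python
import numpy as np

def zo_sign_gd(f, x0, steps=200, lr=0.05, mu=1e-2, n_dirs=10, seed=0):
    """Zeroth-order sign gradient descent (sketch): no analytic gradients,
    only function evaluations of f."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    for _ in range(steps):
        g = np.zeros_like(x)
        for _ in range(n_dirs):
            u = rng.normal(size=x.shape)
            # Two-point finite-difference estimate along random direction u.
            g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
        # Step along the sign of the averaged gradient estimate.
        x -= lr * np.sign(g / n_dirs)
    return x
```

On a simple quadratic such as `f(x) = ||x||^2`, the iterate reaches a small neighborhood of the optimum whose size is set by the fixed step `lr`.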
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Tree ensemble kernels for Bayesian optimization with known constraints
over mixed-feature spaces [54.58348769621782]
Tree ensembles can be well-suited for black-box optimization tasks such as algorithm tuning and neural architecture search.
Two well-known challenges in using tree ensembles for black-box optimization are (i) effectively quantifying model uncertainty for exploration and (ii) optimizing over the piece-wise constant acquisition function.
Our framework performs as well as state-of-the-art methods for unconstrained black-box optimization over continuous/discrete features and outperforms competing methods for problems combining mixed-variable feature spaces and known input constraints.
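The co-leaf similarity underlying tree-ensemble kernels can be sketched as follows: two points are similar in proportion to the number of trees that route them to the same leaf. The leaf-index matrix below is a hypothetical toy example; in practice it could come from, e.g., a fitted forest's `apply()` method. This is an illustrative sketch, not the paper's framework.

```python
import numpy as np

def tree_agreement_kernel(leaves_a, leaves_b):
    """Tree-ensemble kernel (sketch): similarity of two points is the
    fraction of trees in which they land in the same leaf.
    `leaves_*` are (n_points, n_trees) arrays of leaf indices."""
    # Compare leaf indices tree-by-tree, then average the matches.
    return (leaves_a[:, None, :] == leaves_b[None, :, :]).mean(axis=-1)

# Toy leaf assignments for 3 points across 4 trees (hypothetical values).
L = np.array([[0, 1, 2, 0],
              [0, 1, 3, 1],
              [2, 0, 3, 1]])
K = tree_agreement_kernel(L, L)
```

Such a kernel is symmetric, bounded in [0, 1], and equals 1 on the diagonal, which makes it usable as a GP covariance over mixed-feature spaces.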
arXiv Detail & Related papers (2022-07-02T16:59:37Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, and that its performance can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - Outlier-Robust Sparse Estimation via Non-Convex Optimization [73.18654719887205]
We explore the connection between high-dimensional statistics and non-convex optimization in the presence of sparsity constraints.
We develop novel and simple optimization formulations for these problems.
As a corollary, we obtain that any first-order method that efficiently converges to a stationary point yields an efficient algorithm for these tasks.
arXiv Detail & Related papers (2021-09-23T17:38:24Z) - Few-Shot Bayesian Optimization with Deep Kernel Surrogates [7.208515071018781]
We formulate a few-shot learning problem in which a shared deep surrogate model is trained to adapt to the response function of a new task.
We propose the use of a deep kernel network for a Gaussian process surrogate that is meta-learned in an end-to-end fashion.
As a result, the novel few-shot optimization of our deep kernel surrogate leads to new state-of-the-art results at HPO.
arXiv Detail & Related papers (2021-01-19T15:00:39Z) - Federated Bayesian Optimization via Thompson Sampling [33.087439644066876]
This paper presents federated Thompson sampling (FTS) which overcomes a number of key challenges of FBO and FL in a principled way.
We empirically demonstrate the effectiveness of FTS in terms of communication efficiency, computational efficiency, and practical performance.
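As a hedged illustration of the core idea FTS builds on, the sketch below runs vanilla Thompson sampling on Bernoulli arms: sample a plausible mean reward for each arm from its Beta posterior, play the argmax, and update. The federated aggregation that distinguishes FTS is omitted, and all arm probabilities are made-up toy values.

```python
import numpy as np

def thompson_bandit(probs, rounds=2000, seed=0):
    """Vanilla Thompson sampling on Bernoulli arms (sketch only; the
    federated aggregation step of FTS is not modeled here)."""
    rng = np.random.default_rng(seed)
    k = len(probs)
    wins = np.ones(k)    # Beta posterior: successes + 1
    losses = np.ones(k)  # Beta posterior: failures + 1
    pulls = np.zeros(k, dtype=int)
    for _ in range(rounds):
        # Sample a plausible mean reward for each arm from its posterior.
        theta = rng.beta(wins, losses)
        a = int(np.argmax(theta))
        r = rng.random() < probs[a]  # Bernoulli reward from the chosen arm
        wins[a] += r
        losses[a] += 1 - r
        pulls[a] += 1
    return pulls

pulls = thompson_bandit([0.2, 0.5, 0.8])
```

Over many rounds the posterior concentrates and the best arm is played almost exclusively, which is the exploration-exploitation behavior FTS extends to the federated setting.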
arXiv Detail & Related papers (2020-10-20T09:42:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.