Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning
- URL: http://arxiv.org/abs/2510.19530v1
- Date: Wed, 22 Oct 2025 12:36:49 GMT
- Title: Optimizing the Unknown: Black Box Bayesian Optimization with Energy-Based Model and Reinforcement Learning
- Authors: Ruiyao Miao, Junren Xiao, Shiya Tsang, Hui Xiong, Yingnian Wu
- Abstract summary: Black-Box Optimization (BBO) has achieved success across various scientific and engineering domains. We propose the Reinforced Energy-Based Model for Bayesian Optimization (REBMBO), which integrates Gaussian Processes (GP) for local guidance with an Energy-Based Model (EBM) to capture global structural information. We conduct extensive experiments on synthetic and real-world benchmarks, confirming the superior performance of REBMBO.
- Score: 42.508822373669936
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Existing Bayesian Optimization (BO) methods typically balance exploration and exploitation to optimize costly objective functions. However, these methods often suffer from a significant one-step bias, which may lead to convergence towards local optima and poor performance in complex or high-dimensional tasks. Recently, Black-Box Optimization (BBO) has achieved success across various scientific and engineering domains, particularly when function evaluations are costly and gradients are unavailable. Motivated by this, we propose the Reinforced Energy-Based Model for Bayesian Optimization (REBMBO), which integrates Gaussian Processes (GP) for local guidance with an Energy-Based Model (EBM) to capture global structural information. Notably, we define each Bayesian Optimization iteration as a Markov Decision Process (MDP) and use Proximal Policy Optimization (PPO) for adaptive multi-step lookahead, dynamically adjusting the depth and direction of exploration to effectively overcome the limitations of traditional BO methods. We conduct extensive experiments on synthetic and real-world benchmarks, confirming the superior performance of REBMBO. Additional analyses across various GP configurations further highlight its adaptability and robustness.
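The abstract gives the overall loop but no implementation details, so the following is only a minimal sketch of the kind of iteration it describes: a GP surrogate supplies local mean/uncertainty estimates, a simple density-based energy term stands in for the EBM's global signal, and a random candidate set plays the role of the policy's proposals (the actual method trains this policy with PPO over a multi-step MDP, which is not reproduced here). All function names (`objective`, `energy`), weights, and hyperparameters are hypothetical illustrations, not taken from the paper.

```python
# Sketch of a REBMBO-style BO loop under the assumptions stated above.
import numpy as np

def rbf_kernel(A, B, length_scale=0.2, variance=1.0):
    # Squared-exponential kernel between two sets of points.
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return variance * np.exp(-0.5 * d2 / length_scale ** 2)

def gp_posterior(X, y, Xs, noise=1e-4):
    # Standard GP regression posterior mean/std at query points Xs.
    K = rbf_kernel(X, X) + noise * np.eye(len(X))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    Ks = rbf_kernel(X, Xs)
    mu = Ks.T @ alpha
    v = np.linalg.solve(L, Ks)
    var = np.clip(np.diag(rbf_kernel(Xs, Xs)) - (v ** 2).sum(0), 1e-12, None)
    return mu, np.sqrt(var)

def energy(Xs, X_obs, bandwidth=0.3):
    # Stand-in for the learned EBM: assigns high "energy" to regions far
    # from all observed points, used below as a global exploration bonus.
    d2 = ((Xs[:, None, :] - X_obs[None, :, :]) ** 2).sum(-1)
    density = np.exp(-0.5 * d2 / bandwidth ** 2).mean(1)
    return -np.log(density + 1e-12)

def objective(x):
    # Hypothetical black-box function on [0, 1]^2 (maximization).
    return -np.sum((x - 0.3) ** 2, axis=-1)

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(5, 2))           # initial design
y = objective(X)

for step in range(20):                        # one BO iteration ~ one MDP step
    cand = rng.uniform(0, 1, size=(512, 2))   # candidate proposals (policy stub)
    mu, sigma = gp_posterior(X, y, cand)      # local guidance from the GP
    bonus = energy(cand, X)                   # global structural signal
    score = mu + 2.0 * sigma + 0.1 * bonus    # UCB-style acquisition + energy bonus
    x_next = cand[np.argmax(score)]
    X = np.vstack([X, x_next])
    y = np.append(y, objective(x_next))

print("best value found:", y.max())
```

In the method proposed by the paper, the candidate proposals and the depth of lookahead would be chosen by a PPO-trained policy rather than by random sampling with a fixed acquisition rule, and the energy term would come from a learned EBM rather than a kernel density over past observations.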
Related papers
- TL-GRPO: Turn-Level RL for Reasoning-Guided Iterative Optimization [97.18886232580131]
Large language models have demonstrated strong reasoning capabilities in complex tasks through tool integration. We propose Turn-Level GRPO, a lightweight RL algorithm that performs turn-level group sampling for fine-grained optimization.
arXiv Detail & Related papers (2026-01-23T06:21:33Z)
- VBO-MI: A Fully Gradient-Based Bayesian Optimization Framework Using Variational Mutual Information Estimation [1.0829694003408499]
VBO-MI is a fully gradient-based BO framework that leverages recent advances in variational mutual information estimation. We evaluate our method on a diverse suite of benchmarks, including high-dimensional synthetic functions and complex real-world tasks.
arXiv Detail & Related papers (2026-01-13T03:07:52Z)
- None To Optima in Few Shots: Bayesian Optimization with MDP Priors [40.4319486959011]
We introduce the Procedure-inFormed BO (ProfBO) algorithm, which solves black-box optimization with remarkably few function evaluations. ProfBO consistently outperforms state-of-the-art methods by achieving high-quality tuning solutions with significantly fewer evaluations.
arXiv Detail & Related papers (2025-11-02T16:53:17Z)
- Divergence Minimization Preference Optimization for Diffusion Model Alignment [66.31417479052774]
Divergence Minimization Preference Optimization (DMPO) is a principled method for aligning diffusion models by minimizing reverse KL divergence. DMPO can consistently outperform or match existing techniques across different base models and test sets.
arXiv Detail & Related papers (2025-07-10T07:57:30Z)
- Direct Regret Optimization in Bayesian Optimization [10.705151736050967]
We propose a novel direct regret optimization approach that jointly learns the optimal model and non-myopic acquisition. We show that our method consistently outperforms BO baselines, achieving lower simple regret and demonstrating more robust exploration.
arXiv Detail & Related papers (2025-07-09T04:09:58Z)
- Nonmyopic Global Optimisation via Approximate Dynamic Programming [14.389086937116582]
We introduce novel nonmyopic acquisition strategies tailored to IDW- and RBF-based global optimisation. Specifically, we develop dynamic programming-based paradigms, including rollout and multi-step scenario-based optimisation schemes.
arXiv Detail & Related papers (2024-12-06T09:25:00Z)
- Batched Bayesian optimization by maximizing the probability of including the optimum [44.38372821900645]
We propose an acquisition strategy for discrete optimization motivated by pure exploitation, qPO (multipoint Probability of Optimality). We apply our method to the model-guided exploration of large chemical libraries and provide empirical evidence that it is competitive with and complements other state-of-the-art methods in batched Bayesian optimization.
arXiv Detail & Related papers (2024-10-08T20:13:12Z)
- Enhanced Bayesian Optimization via Preferential Modeling of Abstract Properties [49.351577714596544]
We propose a human-AI collaborative Bayesian framework to incorporate expert preferences about unmeasured abstract properties into surrogate modeling.
We provide an efficient strategy that can also handle any incorrect/misleading expert bias in preferential judgments.
arXiv Detail & Related papers (2024-02-27T09:23:13Z)
- Large Language Models to Enhance Bayesian Optimization [57.474613739645605]
We present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLM) within Bayesian optimization.
At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations.
Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse.
arXiv Detail & Related papers (2024-02-06T11:44:06Z)
- Poisson Process for Bayesian Optimization [126.51200593377739]
We propose a ranking-based surrogate model based on the Poisson process and introduce an efficient BO framework, namely Poisson Process Bayesian Optimization (PoPBO).
Compared to the classic GP-BO method, our PoPBO has lower costs and better robustness to noise, which is verified by abundant experiments.
arXiv Detail & Related papers (2024-02-05T02:54:50Z)
- Learning Regions of Interest for Bayesian Optimization with Adaptive Level-Set Estimation [84.0621253654014]
We propose a framework, called BALLET, which adaptively filters for a high-confidence region of interest.
We show theoretically that BALLET can efficiently shrink the search space, and can exhibit a tighter regret bound than standard BO.
arXiv Detail & Related papers (2023-07-25T09:45:47Z)
- Sparse Bayesian Optimization [16.867375370457438]
We present several regularization-based approaches that allow us to discover sparse and more interpretable configurations.
We propose a novel differentiable relaxation based on homotopy continuation that makes it possible to target sparsity.
We show that we are able to efficiently optimize for sparsity.
arXiv Detail & Related papers (2022-03-03T18:25:33Z)
This list is automatically generated from the titles and abstracts of the papers on this site.