ZeroShotOpt: Towards Zero-Shot Pretrained Models for Efficient Black-Box Optimization
- URL: http://arxiv.org/abs/2510.03051v1
- Date: Fri, 03 Oct 2025 14:33:23 GMT
- Title: ZeroShotOpt: Towards Zero-Shot Pretrained Models for Efficient Black-Box Optimization
- Authors: Jamison Meindl, Yunsheng Tian, Tony Cui, Veronika Thost, Zhang-Wei Hong, Johannes Dürholt, Jie Chen, Wojciech Matusik, Mina Konaković Luković
- Abstract summary: We present ZeroShotOpt, a general-purpose, pretrained model for continuous black-box optimization tasks ranging from 2D to 20D. Our approach leverages offline reinforcement learning on large-scale optimization trajectories collected from 12 BO variants.
- Score: 31.894110383242566
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Global optimization of expensive, derivative-free black-box functions requires extreme sample efficiency. While Bayesian optimization (BO) is the current state-of-the-art, its performance hinges on surrogate and acquisition function hyper-parameters that are often hand-tuned and fail to generalize across problem landscapes. We present ZeroShotOpt, a general-purpose, pretrained model for continuous black-box optimization tasks ranging from 2D to 20D. Our approach leverages offline reinforcement learning on large-scale optimization trajectories collected from 12 BO variants. To scale pretraining, we generate millions of synthetic Gaussian process-based functions with diverse landscapes, enabling the model to learn transferable optimization policies. As a result, ZeroShotOpt achieves robust zero-shot generalization on a wide array of unseen benchmarks, matching or surpassing the sample efficiency of leading global optimizers, including BO, while also offering a reusable foundation for future extensions and improvements. Our open-source code, dataset, and model are available at: https://github.com/jamisonmeindl/zeroshotopt
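The pretraining data described above hinges on cheap synthetic objectives drawn from a Gaussian process prior. As a minimal sketch of how such landscapes can be generated (not the authors' exact pipeline; the RBF kernel, lengthscale, and random Fourier feature approximation are all assumptions), one might write:

```python
import numpy as np

def sample_gp_function(dim, lengthscale=0.2, n_features=256, seed=0):
    """Draw one random function f: [0,1]^dim -> R from an (approximate)
    RBF-kernel GP prior via random Fourier features."""
    rng = np.random.default_rng(seed)
    # The RBF kernel's spectral density is Gaussian with std 1/lengthscale.
    W = rng.normal(scale=1.0 / lengthscale, size=(n_features, dim))
    b = rng.uniform(0.0, 2.0 * np.pi, size=n_features)
    w = rng.normal(size=n_features) * np.sqrt(2.0 / n_features)

    def f(x):
        x = np.atleast_2d(x)           # (n, dim)
        phi = np.cos(x @ W.T + b)      # (n, n_features) random features
        return phi @ w                 # (n,) function values
    return f

# One synthetic 2D landscape, evaluated at five random points.
f = sample_gp_function(dim=2, seed=42)
X = np.random.default_rng(1).uniform(size=(5, 2))
print(f(X))
```

Varying the lengthscale, kernel, and dimensionality across millions of such draws would give the kind of landscape diversity the abstract describes.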
Related papers
- GPTOpt: Towards Efficient LLM-Based Black-Box Optimization [33.09351655863645]
Large Language Models (LLMs) have shown broad capabilities, yet state-of-the-art models remain limited in solving continuous black-box optimization tasks. We introduce GPTOpt, an LLM-based optimization method that equips LLMs with continuous black-box optimization capabilities.
arXiv Detail & Related papers (2025-10-29T11:21:55Z) - Simplifying Bayesian Optimization Via In-Context Direct Optimum Sampling [18.852567298468742]
We propose a completely in-context, zero-shot solution for BO that requires neither surrogate fitting nor acquisition function optimization. This is done by using a pre-trained context model to directly sample from the posterior over the optimum point. We achieve an efficiency gain of more than 35x in wall-clock time compared with Gaussian process-based BO.
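A rough analogue of "sampling the posterior over the optimum" can be written with an ordinary GP surrogate (the paper's pretrained in-context model is swapped out here for scikit-learn's GaussianProcessRegressor; the toy objective and candidate grid are assumptions), which amounts to Thompson sampling of the argmin:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

objective = lambda x: np.sin(5 * x) + 0.5 * x        # toy 1D black box

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(6, 1))                   # observations so far
y = objective(X[:, 0])

gp = GaussianProcessRegressor().fit(X, y)
cand = np.linspace(0, 1, 200)[:, None]               # candidate grid

# Each posterior function draw yields one sample of the optimum's
# location: the argmin of that draw over the candidates.
draws = gp.sample_y(cand, n_samples=100, random_state=1)  # (200, 100)
optimum_samples = cand[np.argmin(draws, axis=0), 0]
print(optimum_samples[:5])
```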
arXiv Detail & Related papers (2025-05-29T18:07:36Z) - Pretrained Optimization Model for Zero-Shot Black Box Optimization [16.391389860521134]
We propose a Pretrained Optimization Model (POM) that leverages knowledge gained from optimizing diverse tasks. POM offers efficient solutions to zero-shot optimization through direct application or fine-tuning with few-shot samples. Fine-tuning POM with a small number of samples and a small budget yields significant performance improvements.
arXiv Detail & Related papers (2024-05-06T09:11:49Z) - Reinforced In-Context Black-Box Optimization [64.25546325063272]
RIBBO is a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.
RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks.
Central to our method is augmenting the optimization histories with regret-to-go tokens, which are designed to represent the performance of an algorithm based on cumulative regret over the future part of the histories.
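One plausible reading of the regret-to-go construction, assuming instantaneous regret y_i − y* on a minimization problem (the paper's exact tokenization may differ), is a suffix sum over the remaining history:

```python
import numpy as np

def regret_to_go(ys, y_star):
    """Cumulative regret over the *future* of a minimization history:
    R_t = sum_{i >= t} (y_i - y_star), computed as a reversed cumsum."""
    inst = np.asarray(ys, dtype=float) - y_star  # instantaneous regrets
    return inst[::-1].cumsum()[::-1]             # suffix sums

history = [3.0, 1.5, 1.1, 1.0]                   # observed objective values
print(regret_to_go(history, y_star=1.0))         # [2.6 0.6 0.1 0. ]
```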
arXiv Detail & Related papers (2024-02-27T11:32:14Z) - Predictive Modeling through Hyper-Bayesian Optimization [60.586813904500595]
We propose a novel way of integrating model selection and BO for the single goal of reaching the function optima faster.
The algorithm alternates between BO in the model space and BO in the function space, where the goodness of the recommended model is assessed.
In addition to improved sample efficiency, the framework outputs information about the black-box function.
arXiv Detail & Related papers (2023-08-01T04:46:58Z) - Large-Batch, Iteration-Efficient Neural Bayesian Design Optimization [37.339567743948955]
We present a novel Bayesian optimization framework specifically tailored to the large-batch, iteration-limited setting that standard BO handles poorly.
Our key contribution is a highly scalable, sample-based acquisition function that performs a non-dominated sorting of objectives.
We show that our acquisition function in combination with different Bayesian neural network surrogates is effective in data-intensive environments with a minimal number of iterations.
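Non-dominated sorting itself is standard: points are peeled off in successive Pareto fronts. A minimal numpy version for minimization (illustrative only; the paper's acquisition wraps this in a sample-based scheme) follows:

```python
import numpy as np

def non_dominated_sort(Y):
    """Split points into successive Pareto fronts (minimization).
    Y has shape (n, n_objectives); returns index arrays, front 0 first."""
    Y = np.asarray(Y, dtype=float)
    remaining = np.arange(len(Y))
    fronts = []
    while remaining.size:
        sub = Y[remaining]
        # y is dominated if some point is <= in every objective
        # and strictly < in at least one.
        dominated = np.array([
            np.any(np.all(sub <= y, axis=1) & np.any(sub < y, axis=1))
            for y in sub
        ])
        fronts.append(remaining[~dominated])
        remaining = remaining[dominated]
    return fronts

Y = np.array([[1.0, 4.0], [2.0, 2.0], [3.0, 3.0], [4.0, 1.0]])
print(non_dominated_sort(Y))   # front 0: points 0, 1, 3; front 1: point 2
```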
arXiv Detail & Related papers (2023-06-01T19:10:57Z) - High-Dimensional Bayesian Optimization via Semi-Supervised Learning with Optimized Unlabeled Data Sampling [6.927830939687371]
TSBO incorporates a teacher model, an unlabeled data sampler, and a student model.
The student is trained on unlabeled data locations generated by the sampler, with pseudo labels predicted by the teacher.
TSBO demonstrates significantly improved sample efficiency in several global optimization tasks under tight labeled-data budgets.
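A heavily simplified sketch of that teacher-student loop (the model choices, uniform sampler, and toy objective are all assumptions; TSBO's actual components are more elaborate):

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
objective = lambda X: np.sin(3 * X[:, 0]) + X[:, 0] ** 2   # toy black box

# A few expensive labeled evaluations, plus a large unlabeled pool from
# the sampler (here just uniform, for illustration).
X_lab = rng.uniform(-2, 2, size=(8, 1))
y_lab = objective(X_lab)
X_unlab = rng.uniform(-2, 2, size=(200, 1))

# Teacher fits the labeled data and pseudo-labels the unlabeled locations.
teacher = GaussianProcessRegressor().fit(X_lab, y_lab)
pseudo = teacher.predict(X_unlab)

# Student trains on real labels plus pseudo-labels.
student = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=2000)
student.fit(np.vstack([X_lab, X_unlab]), np.concatenate([y_lab, pseudo]))
print(student.predict([[0.5]]))
```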
arXiv Detail & Related papers (2023-05-04T07:43:40Z) - VeLO: Training Versatile Learned Optimizers by Scaling Up [67.90237498659397]
We leverage the same scaling approach behind the success of deep learning to learn versatile optimizers.
We train an optimizer for deep learning which is itself a small neural network that ingests gradients and outputs parameter updates.
We open source our learned optimizer, meta-training code, the associated train and test data, and an extensive benchmark suite with baselines at velo-code.io.
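The core idea, stripped to a toy: a small network maps per-parameter features such as gradients and momentum to updates, replacing a hand-designed rule like SGD. The feature set, layer sizes, and random (rather than meta-trained) weights below are assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

# A tiny "optimizer network": its weights would normally be meta-trained;
# random values here only demonstrate the data flow.
W1 = rng.normal(scale=0.1, size=(2, 16))
W2 = rng.normal(scale=0.1, size=(16, 1))

def learned_update(grad, momentum):
    """Map per-parameter (gradient, momentum) features to updates."""
    feats = np.stack([grad, momentum], axis=-1)  # (n_params, 2)
    hidden = np.tanh(feats @ W1)                 # (n_params, 16)
    return (hidden @ W2).squeeze(-1)             # (n_params,)

# One step on the toy loss L(x) = ||x||^2 / 2, whose gradient is x.
x = rng.normal(size=5)
grad = x                                         # dL/dx
m = grad.copy()                                  # momentum after one step
x = x + learned_update(grad, m)
print(x)
```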
arXiv Detail & Related papers (2022-11-17T18:39:07Z) - Pre-training helps Bayesian optimization too [49.28382118032923]
We seek an alternative practice for setting functional priors.
In particular, we consider the scenario where we have data from similar functions that allow us to pre-train a tighter distribution a priori.
Our results show that our method is able to locate good hyperparameters at least 3 times more efficiently than the best competing methods.
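The mechanics of pre-training a tighter prior can be sketched with scikit-learn: fit a GP to pooled data from related functions to estimate kernel hyperparameters, then freeze them for the new task. This is a simplification of both this paper's method and HyperBO below, and the related-task data here is synthetic:

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import ConstantKernel, RBF

rng = np.random.default_rng(0)

# Pooled evaluations of *related* functions stand in for pre-training
# data; fitting a GP to them estimates kernel hyperparameters.
X_rel = rng.uniform(0, 1, size=(60, 2))
y_rel = np.sin(5 * X_rel[:, 0]) * np.cos(3 * X_rel[:, 1])
pre = GaussianProcessRegressor(kernel=ConstantKernel() * RBF()).fit(X_rel, y_rel)

# New task: reuse the fitted kernel with hyperparameters frozen
# (optimizer=None), so the surrogate starts from the pre-trained prior.
gp_new = GaussianProcessRegressor(kernel=pre.kernel_, optimizer=None)
X_new = np.array([[0.2, 0.3], [0.7, 0.6]])
gp_new.fit(X_new, np.sin(5 * X_new[:, 0]) * np.cos(3 * X_new[:, 1]))
print(gp_new.predict(np.array([[0.5, 0.5]])))
```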
arXiv Detail & Related papers (2022-07-07T04:42:54Z) - Pre-trained Gaussian Processes for Bayesian Optimization [24.730678780782647]
We propose a new pre-training-based BO framework named HyperBO.
We show bounded posterior predictions and near-zero regrets for HyperBO without assuming the "ground truth" GP prior is known.
arXiv Detail & Related papers (2021-09-16T20:46:26Z) - Bayesian Optimization for Selecting Efficient Machine Learning Models [53.202224677485525]
We present a unified Bayesian Optimization framework for jointly optimizing models for both prediction effectiveness and training efficiency.
Experiments on model selection for recommendation tasks indicate that models selected this way significantly improve model training efficiency.
arXiv Detail & Related papers (2020-08-02T02:56:30Z) - BOSH: Bayesian Optimization by Sampling Hierarchically [10.10241176664951]
We propose a novel BO routine pairing a hierarchical Gaussian process with an information-theoretic framework to generate a growing pool of realizations.
We demonstrate that BOSH provides more efficient and higher-precision optimization than standard BO across synthetic benchmarks, simulation optimization, reinforcement learning, and hyperparameter tuning tasks.
arXiv Detail & Related papers (2020-07-02T07:35:49Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation for global optimization with Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces convergence time by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)