Language Model Embeddings Can Be Sufficient for Bayesian Optimization
- URL: http://arxiv.org/abs/2410.10190v3
- Date: Thu, 09 Oct 2025 17:20:18 GMT
- Title: Language Model Embeddings Can Be Sufficient for Bayesian Optimization
- Authors: Tung Nguyen, Qiuyi Zhang, Bangding Yang, Chansoo Lee, Jorg Bornschein, Yingjie Miao, Sagi Perel, Yutian Chen, Xingyou Song
- Abstract summary: We show that representing inputs as strings enables general-purpose regression across diverse domains. Our approach achieves optimization performance comparable to state-of-the-art Gaussian Process-based methods such as Google Vizier.
- Score: 15.105661404305005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Bayesian Optimization is ubiquitous in experimental design and black-box optimization for improving search efficiency. However, most existing approaches rely on regression models which are limited to fixed search spaces and structured, tabular input features. This paper explores the use of LLM embeddings over string inputs for in-context regression in Bayesian Optimization. Our results show that representing inputs as strings enables general-purpose regression across diverse domains, including synthetic, combinatorial, and hyperparameter optimization. Furthermore, our approach achieves optimization performance comparable to state-of-the-art Gaussian Process-based methods such as Google Vizier, and demonstrates potential for broader and more flexible applications.
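As a rough sketch of the idea described in the abstract, the snippet below embeds configuration strings with an off-the-shelf sentence encoder and fits a Gaussian Process regressor on those embeddings to drive a UCB-style acquisition. The encoder model, kernel, and acquisition rule are illustrative assumptions, not the paper's exact in-context regressor.

```python
# Sketch only: GP regression over LLM string embeddings for BO.
# The encoder model and UCB acquisition are illustrative assumptions.
import numpy as np
from sentence_transformers import SentenceTransformer
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import Matern

embedder = SentenceTransformer("all-MiniLM-L6-v2")  # assumed encoder

def suggest(observed, candidates):
    """Return the candidate string with the highest UCB score."""
    X = embedder.encode(list(observed))            # embed evaluated configs
    y = np.array(list(observed.values()))
    gp = GaussianProcessRegressor(kernel=Matern(nu=2.5), normalize_y=True)
    gp.fit(X, y)
    mu, sigma = gp.predict(embedder.encode(candidates), return_std=True)
    return candidates[int(np.argmax(mu + 2.0 * sigma))]  # UCB with beta = 2

# Any search space works as long as each point serializes to a string.
next_cfg = suggest({"lr=0.1,depth=3": 0.81, "lr=0.01,depth=5": 0.87},
                   ["lr=0.05,depth=4", "lr=0.001,depth=6"])
```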
Related papers
- Local Entropy Search over Descent Sequences for Bayesian Optimization [48.7994415668802]
A practical alternative is to iteratively refine the neighborhood of an initial design using local optimization methods such as gradient descent. We propose local entropy search (LES), a Bayesian optimization paradigm that explicitly targets the solutions reachable by the descent sequences.
arXiv Detail & Related papers (2025-11-24T15:52:17Z) - A Novel Unified Parametric Assumption for Nonconvex Optimization [53.943470475510196]
Nonconvex optimization is central to machine learning, but the general framework of nonconvexity yields convergence guarantees that are too weak and pessimistic compared to practice. We introduce a novel unified parametric assumption for analyzing nonconvex optimization algorithms.
arXiv Detail & Related papers (2025-02-17T21:25:31Z) - Improving Existing Optimization Algorithms with LLMs [0.9668407688201361]
This paper investigates how Large Language Models (LLMs) can enhance existing optimization algorithms. Using their pre-trained knowledge, we demonstrate their ability to propose innovative variations and implementation strategies. Our results show that an alternative proposed by GPT-4o outperforms the expert-designed heuristic of CMSA.
arXiv Detail & Related papers (2025-02-12T10:58:57Z) - Align-Pro: A Principled Approach to Prompt Optimization for LLM Alignment [40.71270945505082]
Large language models (LLMs) are increasingly integrated into various societal and decision-making processes. Traditional methods, such as reinforcement learning from human feedback (RLHF), achieve alignment by fine-tuning model parameters. In contrast, prompt optimization is a viable alternative to RLHF for LLM alignment.
arXiv Detail & Related papers (2025-01-07T03:14:39Z) - Indirect Query Bayesian Optimization with Integrated Feedback [17.66813850517961]
We develop a new class of Bayesian optimization problems where integrated feedback is given via a conditional expectation of the unknown function $f$ to be optimized. The goal is to find the global optimum of $f$ by adaptively querying and observing in the space transformed by the conditional distribution. This is motivated by real-world applications where one cannot access direct feedback due to privacy, hardware or computational constraints.
arXiv Detail & Related papers (2024-12-18T07:20:33Z) - Simulation Based Bayesian Optimization [0.0]
This paper introduces Simulation Based Bayesian Optimization (SBBO) as a novel approach to optimizing acquisition functions. GPs are commonly used as the surrogate model as they offer analytical access to posterior predictive distributions. We demonstrate empirically the effectiveness of SBBO using various choices of surrogate models.
arXiv Detail & Related papers (2024-01-19T16:56:11Z) - Backpropagation of Unrolled Solvers with Folded Optimization [55.04219793298687]
The integration of constrained optimization models as components in deep networks has led to promising advances on many specialized learning tasks.
One typical strategy is algorithm unrolling, which relies on automatic differentiation through the operations of an iterative solver.
This paper provides theoretical insights into the backward pass of unrolled optimization, leading to a system for generating efficiently solvable analytical models of backpropagation.
arXiv Detail & Related papers (2023-01-28T01:50:42Z) - An Empirical Evaluation of Zeroth-Order Optimization Methods on
AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives.
We show the advantages of ZO sign-based gradient descent (ZO-signGD).
We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
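For context, here is a minimal sketch of the ZO-signGD scheme named above: the gradient is estimated from random finite differences, and only its sign drives the update. The step size, smoothing radius, and direction count are illustrative choices.

```python
# Sketch of zeroth-order sign gradient descent (ZO-signGD): the gradient
# is estimated from random finite differences; only its sign is used.
import numpy as np

def zo_sign_gd(f, x0, lr=0.01, mu=1e-2, n_dirs=20, steps=100, seed=0):
    rng = np.random.default_rng(seed)
    x = x0.astype(float).copy()
    for _ in range(steps):
        g = np.zeros_like(x)
        for _ in range(n_dirs):
            u = rng.standard_normal(x.shape)
            # Two-point finite-difference estimate along direction u.
            g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
        x -= lr * np.sign(g / n_dirs)     # step with the sign of the estimate
    return x

x_min = zo_sign_gd(lambda x: np.sum(x ** 2), np.ones(5))
```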
arXiv Detail & Related papers (2022-10-27T01:58:10Z) - Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient, scalable, and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the search space into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent-form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z) - Sparse high-dimensional linear regression with a partitioned empirical Bayes ECM algorithm [62.997667081978825]
We propose a computationally efficient and powerful Bayesian approach for sparse high-dimensional linear regression.
Minimal prior assumptions on the parameters are made through the use of plug-in empirical Bayes estimates.
The proposed approach is implemented in the R package probe.
arXiv Detail & Related papers (2022-09-16T19:15:50Z) - Optimistic Optimization of Gaussian Process Samples [30.226274682578172]
A competing, computationally more efficient, global optimization framework is optimistic optimization, which exploits prior knowledge about the geometry of the search space in the form of a dissimilarity function.
We argue that there is a new research domain between geometric and probabilistic search, i.e. methods that run drastically faster than traditional Bayesian optimization, while retaining some of the crucial functionality of Bayesian optimization.
arXiv Detail & Related papers (2022-09-02T09:06:24Z) - Surrogate modeling for Bayesian optimization beyond a single Gaussian process [62.294228304646516]
We propose a novel Bayesian surrogate model to balance exploration with exploitation of the search space.
To endow function sampling with scalability, random feature-based kernel approximation is leveraged per GP model.
To further establish convergence of the proposed EGP-TS to the global optimum, analysis is conducted based on the notion of Bayesian regret.
arXiv Detail & Related papers (2022-05-27T16:43:10Z) - Towards Learning Universal Hyperparameter Optimizers with Transformers [57.35920571605559]
We introduce the OptFormer, the first text-based Transformer HPO framework that provides a universal end-to-end interface for jointly learning policy and function prediction.
Our experiments demonstrate that the OptFormer can imitate at least 7 different HPO algorithms, which can be further improved via its function uncertainty estimates.
arXiv Detail & Related papers (2022-05-26T12:51:32Z) - Optimizer Amalgamation [124.33523126363728]
We are motivated to study a new problem named Optimizer Amalgamation: how can we best combine a pool of "teacher" optimizers into a single "student" optimizer with stronger problem-specific performance?
First, we define three differentiable mechanisms to amalgamate a pool of analytical optimizers by gradient descent.
To reduce variance, we also explore methods to stabilize the amalgamation process by perturbing the target.
arXiv Detail & Related papers (2022-03-12T16:07:57Z) - Scalable Bayesian Optimization Using Vecchia Approximations of Gaussian Processes [0.0]
We adapt the Vecchia approximation, a popular GP approximation from spatial statistics, to enable scalable high-dimensional Bayesian optimization.
We focus on the use of our warped Vecchia GP in trust-region Bayesian optimization via Thompson sampling.
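As a minimal sketch of the Thompson-sampling step inside a trust region, with an exact scikit-learn GP standing in for the paper's scalable warped Vecchia approximation:

```python
# Sketch: Thompson sampling within a trust region. An exact scikit-learn
# GP stands in for the paper's scalable warped Vecchia approximation.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor

def thompson_step(gp, center, radius, n_cand=512, seed=0):
    rng = np.random.default_rng(seed)
    # Candidates sampled uniformly from the box trust region at the incumbent.
    cand = center + rng.uniform(-radius, radius, size=(n_cand, center.size))
    # Draw one joint posterior sample; its argmin is the next query (minimizing).
    f_sample = gp.sample_y(cand, n_samples=1, random_state=seed).ravel()
    return cand[int(np.argmin(f_sample))]

rng = np.random.default_rng(1)
X, y = rng.uniform(size=(20, 4)), rng.normal(size=20)
gp = GaussianProcessRegressor(normalize_y=True).fit(X, y)
x_next = thompson_step(gp, center=X[np.argmin(y)], radius=0.1)
```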
arXiv Detail & Related papers (2022-03-02T23:55:14Z) - Fourier Representations for Black-Box Optimization over Categorical Variables [34.0277529502051]
We propose to use existing methods in conjunction with a surrogate model for the black-box evaluations over purely categorical variables.
To learn such representations, we consider two different settings to update our surrogate model.
Numerical experiments over synthetic benchmarks as well as real-world RNA sequence optimization and design problems demonstrate the representational power of the proposed methods.
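A rough sketch of the surrogate idea for the binary special case (the paper's general categorical encoding and its two update settings are not reproduced here): order-two Walsh/Fourier basis features fit with ridge regression.

```python
# Sketch: second-order Fourier (Walsh) surrogate over binary variables,
# fit by ridge regression. Simplified stand-in for the paper's encoding.
import itertools
import numpy as np
from sklearn.linear_model import Ridge

def walsh_features(X):
    """Order<=2 Fourier basis chi_S(x) = prod_{i in S} (-1)^{x_i}."""
    signs = 1 - 2 * X                       # map {0,1} -> {+1,-1}
    n = X.shape[1]
    cols = [np.ones(len(X))]                # empty set S
    cols += [signs[:, i] for i in range(n)]                # |S| = 1
    cols += [signs[:, i] * signs[:, j]                     # |S| = 2
             for i, j in itertools.combinations(range(n), 2)]
    return np.column_stack(cols)

X = np.random.default_rng(0).integers(0, 2, size=(40, 6))
y = X.sum(axis=1) - 2.0 * X[:, 0] * X[:, 1]   # toy black-box values
surrogate = Ridge(alpha=1e-3).fit(walsh_features(X), y)
```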
arXiv Detail & Related papers (2022-02-08T08:14:58Z) - Triangulation candidates for Bayesian optimization [0.3222802562733786]
Bayesian optimization is a form of sequential design: idealize input-output relationships with a suitably flexible regression model.
Here we propose using candidates based on a Delaunay triangulation of the existing input-output pairs, built with a conventional convex hull library.
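A minimal sketch of the candidate scheme, assuming SciPy's Qhull-backed Delaunay routine and taking simplex centroids as one simple choice of interior candidate:

```python
# Sketch: new BO candidates from the Delaunay triangulation of the
# points evaluated so far (SciPy wraps the Qhull convex-hull library).
import numpy as np
from scipy.spatial import Delaunay

def triangulation_candidates(X_observed):
    tri = Delaunay(X_observed)
    # Each simplex contributes one interior candidate: its centroid.
    return X_observed[tri.simplices].mean(axis=1)

X = np.random.default_rng(1).uniform(size=(12, 2))
candidates = triangulation_candidates(X)   # score these with the surrogate
```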
arXiv Detail & Related papers (2021-12-14T15:13:31Z) - Text Counterfactuals via Latent Optimization and Shapley-Guided Search [15.919650185010491]
We study the problem of generating counterfactual text for a classification model.
We aim to minimally alter the text to change the model's prediction.
White-box approaches have been successfully applied to similar problems in vision.
arXiv Detail & Related papers (2021-10-22T05:04:40Z) - Are we Forgetting about Compositional Optimisers in Bayesian Optimisation? [66.39551991177542]
This paper presents a sample-efficient methodology for global optimisation.
Within this, a crucial performance-determining subroutine is the maximisation of the acquisition function.
We highlight the empirical advantages of the compositional approach to acquisition function maximisation across 3958 individual experiments.
arXiv Detail & Related papers (2020-12-15T12:18:38Z) - An AI-Assisted Design Method for Topology Optimization Without Pre-Optimized Training Data [68.8204255655161]
An AI-assisted design method based on topology optimization is presented, which is able to obtain optimized designs in a direct way.
Designs are provided by an artificial neural network, the predictor, on the basis of boundary conditions and degree of filling as input data.
arXiv Detail & Related papers (2020-12-11T14:33:27Z) - BOSS: Bayesian Optimization over String Spaces [15.630421177117634]
This article develops a Bayesian optimization (BO) method which acts directly over raw strings.
It proposes the first uses of string kernels and genetic algorithms within BO loops.
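As a simplified stand-in for the string-kernel surrogate (BOSS itself uses sub-sequence string kernels inside a GP, with genetic algorithms for acquisition), the sketch below scores raw strings with an n-gram spectrum kernel and kernel ridge regression:

```python
# Sketch: an n-gram "spectrum" kernel over raw strings with a
# kernel-ridge surrogate; a simplified stand-in for BOSS's
# subsequence string kernels inside a full GP.
from collections import Counter
import numpy as np
from sklearn.kernel_ridge import KernelRidge

def spectrum(s, n=3):
    return Counter(s[i:i + n] for i in range(len(s) - n + 1))

def gram(A, B, n=3):
    """K[i, j] = number of shared n-grams, counted with multiplicity."""
    sa, sb = [spectrum(s, n) for s in A], [spectrum(s, n) for s in B]
    return np.array([[sum((a & b).values()) for b in sb] for a in sa],
                    dtype=float)

train = ["GGCUAUAA", "GGCAAUCC", "UUCGAUGG"]
y = np.array([0.3, 0.8, 0.5])
model = KernelRidge(kernel="precomputed", alpha=1e-2).fit(gram(train, train), y)
scores = model.predict(gram(["GGCUAUCC"], train))   # score a new string
```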
arXiv Detail & Related papers (2020-10-02T13:18:27Z) - Global Optimization of Gaussian processes [52.77024349608834]
We propose a reduced-space formulation with Gaussian processes trained on few data points.
The approach also leads to significantly smaller and computationally cheaper subproblems for lower bounding.
In total, the proposed method reduces convergence time by orders of magnitude.
arXiv Detail & Related papers (2020-05-21T20:59:11Z)