Large Language Models to Enhance Bayesian Optimization
- URL: http://arxiv.org/abs/2402.03921v2
- Date: Fri, 8 Mar 2024 12:23:56 GMT
- Title: Large Language Models to Enhance Bayesian Optimization
- Authors: Tennison Liu, Nicolás Astorga, Nabeel Seedat, and Mihaela van der Schaar
- Abstract summary: We present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLMs) within Bayesian optimization.
At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations.
Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse.
- Score: 57.474613739645605
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Bayesian optimization (BO) is a powerful approach for optimizing complex and
expensive-to-evaluate black-box functions. Its importance is underscored in
many applications, notably including hyperparameter tuning, but its efficacy
depends on efficiently balancing exploration and exploitation. While there has
been substantial progress in BO methods, striking this balance remains a
delicate process. In this light, we present LLAMBO, a novel approach that
integrates the capabilities of Large Language Models (LLMs) within BO. At a high
level, we frame the BO problem in natural language, enabling LLMs to
iteratively propose and evaluate promising solutions conditioned on historical
evaluations. More specifically, we explore how combining contextual
understanding, few-shot learning proficiency, and domain knowledge of LLMs can
improve model-based BO. Our findings illustrate that LLAMBO is effective at
zero-shot warmstarting, and enhances surrogate modeling and candidate sampling,
especially in the early stages of search when observations are sparse. Our
approach is performed in context and does not require LLM finetuning.
Additionally, it is modular by design, allowing individual components to be
integrated into existing BO frameworks, or function cohesively as an end-to-end
method. We empirically validate LLAMBO's efficacy on the problem of
hyperparameter tuning, highlighting strong empirical performance across a range
of diverse benchmarks, proprietary tasks, and synthetic tasks.
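As a concrete illustration of the loop described above, the sketch below serializes the observation history into a natural-language prompt, asks an LLM to propose candidate configurations, uses the LLM again as a zero-shot surrogate that scores each candidate, and evaluates the top-ranked one on the true objective. The prompt wording, the `query_llm` helper, and the JSON-per-line response format are illustrative assumptions, not the paper's actual templates.

```python
# Minimal sketch of one LLM-in-the-loop BO iteration (assumed interfaces, not
# the paper's exact prompts). The LLM both proposes candidates and acts as a
# zero-shot surrogate; everything happens in context, with no finetuning.
import json
from typing import Callable

def query_llm(prompt: str) -> str:
    """Placeholder: call a chat-completion API of your choice and return the text."""
    raise NotImplementedError

def history_to_text(history: list[tuple[dict, float]]) -> str:
    return "\n".join(f"config={json.dumps(cfg)}, score={y:.4f}" for cfg, y in history)

def llambo_style_step(history: list[tuple[dict, float]],
                      objective: Callable[[dict], float],
                      n_candidates: int = 5) -> tuple[dict, float]:
    context = history_to_text(history)
    # 1) Candidate sampling: ask the LLM for promising configurations.
    proposal_prompt = (
        "You are tuning hyperparameters. Past evaluations:\n"
        f"{context}\n"
        f"Propose {n_candidates} new configurations as JSON objects, one per line."
    )
    candidates = [json.loads(line)
                  for line in query_llm(proposal_prompt).splitlines() if line.strip()]
    # 2) Surrogate modeling: ask the LLM to predict a score for each candidate.
    scored = []
    for cfg in candidates:
        score_prompt = (
            f"Past evaluations:\n{context}\n"
            f"Predict the score of config={json.dumps(cfg)} as a single number."
        )
        scored.append((cfg, float(query_llm(score_prompt).strip())))
    # 3) Evaluate the candidate the surrogate ranks highest on the true objective.
    best_cfg, _ = max(scored, key=lambda t: t[1])
    y = objective(best_cfg)
    history.append((best_cfg, y))
    return best_cfg, y
```

Because the whole step runs in context, the same pattern can warmstart an empty history or be dropped into an existing BO loop as only the candidate sampler or only the surrogate, in line with the modularity the abstract describes.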
Related papers
- Optima: Optimizing Effectiveness and Efficiency for LLM-Based Multi-Agent System [75.25394449773052]
Large Language Model (LLM) based multi-agent systems (MAS) show remarkable potential in collaborative problem-solving.
Yet they still face critical challenges: low communication efficiency, poor scalability, and a lack of effective parameter-updating optimization methods.
We present Optima, a novel framework that addresses these issues by significantly enhancing both communication efficiency and task effectiveness.
arXiv Detail & Related papers (2024-10-10T17:00:06Z)
- LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in search efficiency and problem-solving capability (a simplified refine-and-compare loop is sketched after this entry).
arXiv Detail & Related papers (2024-10-03T18:12:29Z)
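The entry above pairs iterative Self-Refine with pairwise comparison of solutions; the sketch below shows that inner refine-and-compare loop under assumed interfaces. In the actual framework this step is organized inside Monte Carlo Tree Search over reasoning paths, and `query_llm` and `prefer` are placeholders rather than the paper's components.

```python
# Simplified refine-and-compare loop: generate a solution, critique it, revise it,
# and keep the pairwise winner. The full LLaMA-Berry framework embeds this kind of
# refinement inside MCTS; the helpers here are assumed placeholders.
def query_llm(prompt: str) -> str:
    raise NotImplementedError  # call your LLM here

def prefer(problem: str, a: str, b: str) -> str:
    """Pairwise judge: return whichever of `a` or `b` better solves `problem`."""
    raise NotImplementedError

def refine_and_compare(problem: str, n_rounds: int = 3) -> str:
    best = query_llm(f"Solve the following problem step by step:\n{problem}")
    for _ in range(n_rounds):
        critique = query_llm(f"Problem:\n{problem}\nSolution:\n{best}\n"
                             "Point out any mistakes or weaknesses in this solution.")
        revised = query_llm(f"Problem:\n{problem}\nSolution:\n{best}\n"
                            f"Critique:\n{critique}\nWrite an improved solution.")
        best = prefer(problem, best, revised)  # keep the pairwise winner
    return best
```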
- Reinforced In-Context Black-Box Optimization [64.25546325063272]
RIBBO is a method to reinforce-learn a BBO algorithm from offline data in an end-to-end fashion.
RIBBO employs expressive sequence models to learn the optimization histories produced by multiple behavior algorithms and tasks.
Central to our method is augmenting the optimization histories with regret-to-go tokens, which represent an algorithm's performance as the cumulative regret over the future part of the history (a minimal sketch follows this entry).
arXiv Detail & Related papers (2024-02-27T11:32:14Z)
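A minimal sketch of the regret-to-go idea referenced above, assuming a maximization problem with a known optimum `f_star`: each step of an optimization history is tagged with the regret accumulated over the remaining steps, so a sequence model conditioned on a small regret-to-go token is steered toward high-performing behavior. The paper's exact definition, normalization, and tokenization may differ.

```python
# Tag each step of an optimization history with the cumulative regret over the
# *future* part of the history (regret-to-go). Assumes maximization and a known
# optimum f_star; this is a simplified stand-in for the paper's tokens.
def regret_to_go(values: list[float], f_star: float) -> list[float]:
    """values[t] is the objective value observed at step t."""
    rtg = [0.0] * len(values)
    running = 0.0
    for t in reversed(range(len(values))):
        running += f_star - values[t]  # instantaneous regret at step t
        rtg[t] = running               # regret accumulated from step t onward
    return rtg

# Example: observed values 0.0, 0.5, 1.0 with optimum 1.0
print(regret_to_go([0.0, 0.5, 1.0], f_star=1.0))  # [1.5, 0.5, 0.0]
```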
- PhaseEvo: Towards Unified In-Context Prompt Optimization for Large Language Models [9.362082187605356]
We present PhaseEvo, an efficient automatic prompt optimization framework that combines the generative capability of LLMs with the global search proficiency of evolution algorithms.
PhaseEvo outperforms state-of-the-art baseline methods by a large margin while maintaining good efficiency (a sketch of the underlying evolutionary loop follows this entry).
arXiv Detail & Related papers (2024-02-17T17:47:10Z)
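The sketch below shows the general pattern such prompt optimizers build on: an evolutionary loop over candidate prompts in which an LLM serves as the mutation operator and a task-specific fitness function drives selection. `query_llm` and `fitness` are placeholders, and PhaseEvo's actual phase schedule and operator set are considerably richer.

```python
# LLM-as-mutation-operator evolutionary loop over prompts (a generic sketch,
# not PhaseEvo's actual phases). `fitness(prompt) -> float` scores a prompt on
# the target task; `query_llm` is a placeholder LLM call.
import random

def query_llm(prompt: str) -> str:
    raise NotImplementedError  # call your LLM here

def evolve_prompts(seed_prompts: list[str], fitness,
                   generations: int = 5, population_size: int = 8) -> str:
    population = list(seed_prompts)
    for _ in range(generations):
        population.sort(key=fitness, reverse=True)        # rank by fitness
        parents = population[: max(2, population_size // 2)]
        children = []
        while len(parents) + len(children) < population_size:
            parent = random.choice(parents)
            children.append(query_llm(
                "Rewrite this instruction to make it clearer and more effective:\n"
                + parent))                                 # LLM as mutation operator
        population = parents + children
    return max(population, key=fitness)
```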
- Entropy-Regularized Token-Level Policy Optimization for Language Agent Reinforcement [67.1393112206885]
Large Language Models (LLMs) have shown promise as intelligent agents in interactive decision-making tasks.
We introduce Entropy-Regularized Token-level Policy Optimization (ETPO), an entropy-augmented RL method tailored for optimizing LLMs at the token level.
We assess the effectiveness of ETPO within a simulated environment that models data science code generation as a series of multi-step interactive tasks (the entropy-augmented objective is sketched after this entry).
arXiv Detail & Related papers (2024-02-09T07:45:26Z)
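As a point of reference for the entry above, the snippet below computes the generic entropy-augmented return at the token level: each generation step's reward is augmented with the policy's entropy over the vocabulary, i.e. sum_t (r_t + alpha * H(pi(.|s_t))). This is the standard soft-RL form, not ETPO's exact decomposition or update rule.

```python
# Entropy-augmented, token-level return for one generated sequence (generic
# soft-RL illustration; shapes and alpha are assumptions, not ETPO's settings).
import numpy as np

def entropy_augmented_return(token_rewards: np.ndarray,
                             token_probs: np.ndarray,
                             alpha: float = 0.01) -> float:
    """token_rewards: (T,) per-token rewards; token_probs: (T, V) policy
    distribution over the vocabulary at each generation step."""
    eps = 1e-12
    entropy = -(token_probs * np.log(token_probs + eps)).sum(axis=-1)  # (T,)
    return float((token_rewards + alpha * entropy).sum())
```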
- Poisson Process for Bayesian Optimization [126.51200593377739]
We propose a ranking-based surrogate model based on the Poisson process and introduce an efficient BO framework, namely Poisson Process Bayesian Optimization (PoPBO).
Compared to the classic GP-BO method, PoPBO has lower costs and better robustness to noise, as verified by extensive experiments.
arXiv Detail & Related papers (2024-02-05T02:54:50Z)
- MoE-LLaVA: Mixture of Experts for Large Vision-Language Models [27.930351465266515]
We propose a simple yet effective training strategy, MoE-Tuning, for LVLMs.
MoE-LLaVA, a MoE-based sparse LVLM architecture, uniquely activates only the top-k experts through routers.
Experiments show the strong performance of MoE-LLaVA across a variety of visual understanding and object hallucination benchmarks (top-k routing is sketched after this entry).
arXiv Detail & Related papers (2024-01-29T08:13:40Z)
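The sketch below illustrates top-k expert routing, the generic mechanism behind sparse MoE layers of this kind: a router scores all experts for a token, only the k highest-scoring experts are executed, and their outputs are mixed with softmax-normalized router weights. The linear router, shapes, and identity "experts" are illustrative assumptions, not MoE-LLaVA's implementation.

```python
# Top-k expert routing for a single token (generic sparse-MoE sketch).
import numpy as np

def top_k_moe(x: np.ndarray, router_w: np.ndarray, experts, k: int = 2) -> np.ndarray:
    """x: (d,) token features; router_w: (n_experts, d); experts: list of callables."""
    logits = router_w @ x                          # score every expert
    top = np.argsort(logits)[-k:]                  # indices of the k best experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                       # softmax over the selected experts
    return sum(w * experts[i](x) for w, i in zip(weights, top))

# Example with three identity "experts", only two of which are activated per token
d, n_experts = 4, 3
experts = [lambda v: v for _ in range(n_experts)]
out = top_k_moe(np.ones(d), np.random.randn(n_experts, d), experts, k=2)
```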
- High-Dimensional Bayesian Optimization with Sparse Axis-Aligned Subspaces [14.03847432040056]
We argue that a surrogate model defined on sparse axis-aligned subspaces offers an attractive compromise between flexibility and parsimony.
We demonstrate that our approach, which relies on Hamiltonian Monte Carlo for inference, can rapidly identify sparse subspaces relevant to modeling the unknown objective function (a prior-sampling sketch follows this entry).
arXiv Detail & Related papers (2021-02-27T23:06:24Z)
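To illustrate why such a surrogate favors sparsity, the sketch below samples the kind of global-local half-Cauchy prior used for sparse axis-aligned subspaces: most dimensions draw near-zero inverse squared lengthscales (effectively switched off) while a few stay active. The scale value is an assumption, and sampling the prior stands in for the Hamiltonian Monte Carlo inference the paper actually performs.

```python
# Draw per-dimension inverse squared lengthscales from a global-local
# half-Cauchy prior; small values mean the dimension is effectively ignored
# by the GP kernel. This only illustrates the sparsity-inducing prior, not
# the paper's full HMC-based inference.
import numpy as np

rng = np.random.default_rng(0)

def sample_saas_prior(n_dims: int, global_scale: float = 0.1) -> np.ndarray:
    tau = abs(global_scale * rng.standard_cauchy())        # global shrinkage level
    rho = np.abs(tau * rng.standard_cauchy(size=n_dims))   # per-dimension inverse sq. lengthscales
    return rho

rho = sample_saas_prior(n_dims=20)
active = np.argsort(rho)[-3:]  # dimensions the prior draw leaves most "active"
print("most active dimensions:", active)
```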
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences.