LB-MCTS: Synergizing Large Language Models and Bayesian Optimization for Efficient CASH
- URL: http://arxiv.org/abs/2601.12355v1
- Date: Sun, 18 Jan 2026 11:26:31 GMT
- Title: LB-MCTS: Synergizing Large Language Models and Bayesian Optimization for Efficient CASH
- Authors: Beicheng Xu, Weitong Qian, Lingching Tung, Yupeng Lu, Bin Cui,
- Abstract summary: We propose LB-MCTS, a framework synergizing Large Language Models (LLMs) and BO within a Monte Carlo Tree Search structure.<n>It maximizes reasoning with Selective Tuning Memory (STM) and explicit exploration-exploitation trade-off.<n>Experiments on 104 AMLB datasets demonstrate the superiority of LB-MCTS over the competitive baselines.
- Score: 10.806107994473495
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: To lower the expertise barrier in machine learning, the AutoML community has focused on the CASH problem, a fundamental challenge that automates the process of algorithm selection and hyperparameter tuning. While traditional methods like Bayesian Optimization (BO) struggle with cold-start issues, Large Language Models (LLMs) can mitigate these via semantic priors. However, existing LLM-based optimizers generalize poorly to the high-dimensional, structured CASH space. We propose LB-MCTS, a framework synergizing LLMs and BO within a Monte Carlo Tree Search structure. It maximizes LLM reasoning with Selective Tuning Memory (STM) and explicit exploration-exploitation trade-off. It combines the strengths of both paradigms by dynamically shifting from LLM-driven to BO-driven proposals as data accumulates. Experiments on 104 AMLB datasets demonstrate the superiority of LB-MCTS over the competitive baselines.
Related papers
- A Meta-Knowledge-Augmented LLM Framework for Hyperparameter Optimization in Time-Series Forecasting [0.0]
We introduce LLM-AutoOpt, a hybrid HPO framework that combines BO with LLM-based contextual reasoning.<n>We show that LLM-AutoOpt achieves improved predictive performance and more interpretable optimization behavior compared to BO and LLM baselines without meta-knowledge.
arXiv Detail & Related papers (2026-02-01T21:26:57Z) - LLM-MemCluster: Empowering Large Language Models with Dynamic Memory for Text Clustering [52.41664454251679]
Large Language Models (LLMs) are reshaping unsupervised learning by offering an unprecedented ability to perform text clustering.<n>Existing methods often rely on complex pipelines with external modules, sacrificing a truly end-to-end approach.<n>We introduce LLM-MemCluster, a novel framework that reconceptualizes clustering as a fully LLM-native task.
arXiv Detail & Related papers (2025-11-19T13:22:08Z) - Pre-trained knowledge elevates large language models beyond traditional chemical reaction optimizers [0.0]
We demonstrate that pre-trained knowledge in large language models (LLMs) fundamentally changes this paradigm.<n>LLM-GO excels precisely where traditional methods struggle: complex categorical spaces requiring domain understanding rather than mathematical optimization.
arXiv Detail & Related papers (2025-08-27T21:09:51Z) - Discrete Tokenization for Multimodal LLMs: A Comprehensive Survey [69.45421620616486]
This work presents the first structured taxonomy and analysis of discrete tokenization methods designed for large language models (LLMs)<n>We categorize 8 representative VQ variants that span classical and modern paradigms and analyze their algorithmic principles, training dynamics, and integration challenges with LLM pipelines.<n>We identify key challenges including codebook collapse, unstable gradient estimation, and modality-specific encoding constraints.
arXiv Detail & Related papers (2025-07-21T10:52:14Z) - LLINBO: Trustworthy LLM-in-the-Loop Bayesian Optimization [5.844783557050259]
Large Language Models (LLMs) have shown remarkable adaptability in low-data regimes.<n>We propose LLINBO: LLM-in-the-Loop BO, a hybrid framework for BO that combines LLMs with statistical surrogate experts.<n>We end the paper with a real-life proof-of-concept in the context of 3D printing.
arXiv Detail & Related papers (2025-05-20T15:54:48Z) - LLM-Lasso: A Robust Framework for Domain-Informed Feature Selection and Regularization [59.75242204923353]
We introduce LLM-Lasso, a framework that leverages large language models (LLMs) to guide feature selection in Lasso regression.<n>LLMs generate penalty factors for each feature, which are converted into weights for the Lasso penalty using a simple, tunable model.<n>Features identified as more relevant by the LLM receive lower penalties, increasing their likelihood of being retained in the final model.
arXiv Detail & Related papers (2025-02-15T02:55:22Z) - Large Language Model-Enhanced Multi-Armed Bandits [43.34246396804588]
Large language models (LLMs) have been adopted to solve sequential decision-making tasks such as multi-armed bandits (MAB)<n>We propose an alternative approach which combines the strengths of classical MAB and LLMs.<n>We conduct empirical evaluations using both synthetic MAB tasks and experiments designed using real-world text datasets.
arXiv Detail & Related papers (2025-02-03T07:19:05Z) - Sequential Large Language Model-Based Hyper-parameter Optimization [0.0]
This study introduces SLLMBO, an innovative framework leveraging large language models (LLMs) for hyper- parameter optimization (HPO)<n>It incorporates dynamic search space adaptability, enhanced parameter space exploitation, and a novel LLM-tree-structured parzen estimator (LLM-TPE) sampler.<n>This comprehensive benchmarking evaluates multiple LLMs, including GPT-3.5-Turbo, GPT-4o, Claude-Sonnet-3.5, and Gemini-1.5-Flash.
arXiv Detail & Related papers (2024-10-27T00:50:30Z) - Read-ME: Refactorizing LLMs as Router-Decoupled Mixture of Experts with System Co-Design [59.00758127310582]
We propose a novel framework Read-ME that transforms pre-trained dense LLMs into smaller MoE models.
Our approach employs activation sparsity to extract experts.
Read-ME outperforms other popular open-source dense models of similar scales.
arXiv Detail & Related papers (2024-10-24T19:48:51Z) - Large Language Model as a Catalyst: A Paradigm Shift in Base Station Siting Optimization [62.16747639440893]
Large language models (LLMs) and their associated technologies advance, particularly in the realms of prompt engineering and agent engineering.<n>Our proposed framework incorporates retrieval-augmented generation (RAG) to enhance the system's ability to acquire domain-specific knowledge and generate solutions.
arXiv Detail & Related papers (2024-08-07T08:43:32Z) - Multi-Reference Preference Optimization for Large Language Models [56.84730239046117]
We introduce a novel closed-form formulation for direct preference optimization using multiple reference models.
The resulting algorithm, Multi-Reference Preference Optimization (MRPO), leverages broader prior knowledge from diverse reference models.
Our experiments demonstrate that LLMs finetuned with MRPO generalize better in various preference data, regardless of data scarcity or abundance.
arXiv Detail & Related papers (2024-05-26T00:29:04Z) - Large Language Models to Enhance Bayesian Optimization [57.474613739645605]
We present LLAMBO, a novel approach that integrates the capabilities of Large Language Models (LLM) within Bayesian optimization.
At a high level, we frame the BO problem in natural language, enabling LLMs to iteratively propose and evaluate promising solutions conditioned on historical evaluations.
Our findings illustrate that LLAMBO is effective at zero-shot warmstarting, and enhances surrogate modeling and candidate sampling, especially in the early stages of search when observations are sparse.
arXiv Detail & Related papers (2024-02-06T11:44:06Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.