I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search
- URL: http://arxiv.org/abs/2502.14693v1
- Date: Thu, 20 Feb 2025 16:19:09 GMT
- Title: I-MCTS: Enhancing Agentic AutoML via Introspective Monte Carlo Tree Search
- Authors: Zujie Liang, Feng Wei, Wujiang Xu, Lin Chen, Yuxi Qian, Xinhui Wu,
- Abstract summary: Introspective Monte Carlo Tree Search (I-MCTS) is a novel approach that iteratively expands tree nodes through an introspective process.
This facilitates a continuous refinement of the node in the search tree, thereby enhancing the overall decision-making process.
A hybrid rewarding mechanism is implemented to seamlessly transition the Q-value from LLM-estimated scores to actual performance scores.
- Score: 10.718560472954644
- License:
- Abstract: Recent advancements in large language models (LLMs) have shown remarkable potential in automating machine learning tasks. However, existing LLM-based agents often struggle with low-diversity and suboptimal code generation. While recent work has introduced Monte Carlo Tree Search (MCTS) to address these issues, limitations persist in the quality and diversity of thoughts generated, as well as in the scalar value feedback mechanisms used for node selection. In this study, we introduce Introspective Monte Carlo Tree Search (I-MCTS), a novel approach that iteratively expands tree nodes through an introspective process that meticulously analyzes solutions and results from parent and sibling nodes. This facilitates a continuous refinement of the node in the search tree, thereby enhancing the overall decision-making process.Furthermore, we integrate a Large Language Model (LLM)-based value model to facilitate direct evaluation of each node's solution prior to conducting comprehensive computational rollouts. A hybrid rewarding mechanism is implemented to seamlessly transition the Q-value from LLM-estimated scores to actual performance scores. This allows higher-quality nodes to be traversed earlier.Applied to the various ML tasks, our approach demonstrates a6\% absolute improvement in performance compared to the strong open-source AutoML agents, showcasing its effectiveness in enhancing agentic AutoML systems.
Related papers
- Mulberry: Empowering MLLM with o1-like Reasoning and Reflection via Collective Monte Carlo Tree Search [74.46681227410038]
We propose Collective Monte Carlo Tree Search (CoMCTS) for effective and efficient reasoning-path searching and learning.
We construct Mulberry-260k, a multimodal dataset with a tree of rich, explicit and well-defined reasoning nodes for each question.
We perform collective SFT to train our model, Mulberry, a series of MLLMs with o1-like step-by-step Reasoning and Reflection capabilities.
arXiv Detail & Related papers (2024-12-24T10:07:51Z) - Multi-Agent Sampling: Scaling Inference Compute for Data Synthesis with Tree Search-Based Agentic Collaboration [81.45763823762682]
This work aims to bridge the gap by investigating the problem of data synthesis through multi-agent sampling.
We introduce Tree Search-based Orchestrated Agents(TOA), where the workflow evolves iteratively during the sequential sampling process.
Our experiments on alignment, machine translation, and mathematical reasoning demonstrate that multi-agent sampling significantly outperforms single-agent sampling as inference compute scales.
arXiv Detail & Related papers (2024-12-22T15:16:44Z) - Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling [63.98194996746229]
Large language models (LLMs) are prone to hallucination and producing factually incorrect information.
We propose a novel framework, called Think&Cite, and formulate attributed text generation as a multi-step reasoning problem integrated with search.
arXiv Detail & Related papers (2024-12-19T13:55:48Z) - SELA: Tree-Search Enhanced LLM Agents for Automated Machine Learning [14.702694298483445]
Tree-Search Enhanced LLM Agents (SELA) is an agent-based system that leverages Monte Carlo Tree Search (MCTS) to optimize the AutoML process.
SELA represents pipeline configurations as trees, enabling agents to conduct experiments intelligently and iteratively refine their strategies.
In an extensive evaluation across 20 machine learning datasets, we compare the performance of traditional and agent-based AutoML methods.
arXiv Detail & Related papers (2024-10-22T17:56:08Z) - AutoML-Agent: A Multi-Agent LLM Framework for Full-Pipeline AutoML [56.565200973244146]
Automated machine learning (AutoML) accelerates AI development by automating tasks in the development pipeline.
Recent works have started exploiting large language models (LLM) to lessen such burden.
This paper proposes AutoML-Agent, a novel multi-agent framework tailored for full-pipeline AutoML.
arXiv Detail & Related papers (2024-10-03T20:01:09Z) - LLaMA-Berry: Pairwise Optimization for O1-like Olympiad-Level Mathematical Reasoning [56.273799410256075]
The framework combines Monte Carlo Tree Search (MCTS) with iterative Self-Refine to optimize the reasoning path.
The framework has been tested on general and advanced benchmarks, showing superior performance in terms of search efficiency and problem-solving capability.
arXiv Detail & Related papers (2024-10-03T18:12:29Z) - Fleet of Agents: Coordinated Problem Solving with Large Language Models using Genetic Particle Filtering [10.167121757937062]
Fleet of Agents (FoA) is a framework utilizing large language models as agents to navigate through dynamic tree searches.
FoA spawns a multitude of agents, each exploring autonomously, followed by a selection phase.
We experimentally validate FoA using two benchmark tasks, "Game of 24 and "Mini-Crosswords"
arXiv Detail & Related papers (2024-05-07T09:36:23Z) - Monte Carlo Tree Search Boosts Reasoning via Iterative Preference Learning [55.96599486604344]
We introduce an approach aimed at enhancing the reasoning capabilities of Large Language Models (LLMs) through an iterative preference learning process.
We use Monte Carlo Tree Search (MCTS) to iteratively collect preference data, utilizing its look-ahead ability to break down instance-level rewards into more granular step-level signals.
The proposed algorithm employs Direct Preference Optimization (DPO) to update the LLM policy using this newly generated step-level preference data.
arXiv Detail & Related papers (2024-05-01T11:10:24Z) - AQA-Bench: An Interactive Benchmark for Evaluating LLMs' Sequential
Reasoning Ability [29.1826948551409]
AQA-Bench is a novel benchmark to assess the sequential reasoning capabilities of large language models.
We build AQA-Bench with three different algorithms, namely binary search, depth-first search, and breadth-first search.
Our investigations reveal several interesting findings.
arXiv Detail & Related papers (2024-02-14T18:59:33Z) - Robusta: Robust AutoML for Feature Selection via Reinforcement Learning [24.24652530951966]
We propose the first robust AutoML framework, Robusta--based on reinforcement learning (RL)
We show that the framework is able to improve the model robustness by up to 22% while maintaining competitive accuracy on benign samples.
arXiv Detail & Related papers (2021-01-15T03:12:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.