Related papers: Scattered Forest Search: Smarter Code Space Exploration with LLMs

Scattered Forest Search: Smarter Code Space Exploration with LLMs

URL: http://arxiv.org/abs/2411.05010v1
Date: Tue, 22 Oct 2024 01:58:29 GMT
Title: Scattered Forest Search: Smarter Code Space Exploration with LLMs
Authors: Jonathan Light, Yue Wu, Yiyou Sun, Wenchao Yu, Yanchi liu, Xujiang Zhao, Ziniu Hu, Haifeng Chen, Wei Cheng,
Abstract summary: We introduce Scattered Forest Search to enhance solution diversity while searching for solutions. Experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode reveal significant performance improvements.
Score: 55.71665969800222
License: http://creativecommons.org/licenses/by/4.0/
Abstract: We propose a novel approach to scaling LLM inference for code generation. We frame code generation as a black box optimization problem within the code space, and employ optimization-inspired techniques to enhance exploration. Specifically, we introduce Scattered Forest Search to enhance solution diversity while searching for solutions. Our theoretical analysis illustrates how these methods avoid local optima during optimization. Extensive experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode reveal significant performance improvements. For instance, our method achieves a pass@1 rate of 67.1% on HumanEval+ and 87.2% on HumanEval with GPT-3.5, marking improvements of 8.6% and 4.3% over the state-of-the-art, while also halving the iterations needed to find the correct solution. Furthermore, our method scales more efficiently than existing search techniques, including tree search, line search, and repeated sampling.

Related papers

Large Language Model Enhanced Particle Swarm Optimization for Hyperparameter Tuning for Deep Learning Models [2.3949320404005436]
Particle Swarm Optimization and Large Language Models (LLMs) have been individually applied in optimization and deep learning. Our work addresses this gap by integrating LLMs into PSO to reduce model evaluations and improve convergence. Our method speeds up search space exploration by substituting underperforming particle placements with best suggestions.
arXiv Detail & Related papers (2025-04-19T00:54:59Z)
Language-Based Bayesian Optimization Research Assistant (BORA) [0.0]
Contextualizing domain knowledge is a powerful approach to guide searches for fruitful regions. Here, we propose use of Language Models (LLMs) for contextualizing search optimization.
arXiv Detail & Related papers (2025-01-27T17:20:04Z)
RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation [65.5353313491402]
We introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS) algorithm to conduct thought-level searches before generating code. We construct verbal feedback from fine-turbo code execution feedback to refine erroneous thoughts during the search. We demonstrate that RethinkMCTS outperforms previous search-based and feedback-based code generation baselines.
arXiv Detail & Related papers (2024-09-15T02:07:28Z)
Search-Based LLMs for Code Optimization [16.843870288512363]
Code written by developers usually suffers from efficiency problems and contain various performance bugs. Recent work regards the task as a sequence generation problem, and resorts to deep learning (DL) techniques such as large language models (LLMs) We propose a search-based LLMs framework named SBLLM that enables iterative refinement and discovery of improved optimization methods.
arXiv Detail & Related papers (2024-08-22T06:59:46Z)
A Three-Stage Algorithm for the Closest String Problem on Artificial and Real Gene Sequences [39.58317527488534]
Closest String Problem is an NP-hard problem that aims to find a string that has the minimum distance from all sequences that belong to the given set of strings. In this paper, we introduce a three-stage algorithm that comprises the following process: first, we apply a novel alphabet pruning method to reduce the search space for effectively finding promising search regions. Second, a variant of beam search to find a solution is employed. This method utilizes a newly developed guiding function based on an expected distance score of partial solutions.
arXiv Detail & Related papers (2024-07-17T21:26:27Z)
Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization [81.88668100203913]
Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks. In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z)
Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z)
A new approach for solving global optimization and engineering problems based on modified Sea Horse Optimizer [1.84493167882938]
Sea Horse (SHO) is a metaheuristic algorithm that emulates various intelligent behaviors exhibited by sea horses. To mimic the nuanced locomotion of sea horses, SHO integrates the logarithmic helical equation and Levy flight. This study introduces a robust and high-performance variant of the SHO algorithm named mSHO.
arXiv Detail & Related papers (2024-02-21T11:28:00Z)
A GPU-Accelerated Moving-Horizon Algorithm for Training Deep Classification Trees on Large Datasets [13.02279265869312]
We introduce a moving-horizon differential algorithm for classification trees with continuous features (MH-DEOCT) Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples. MH-DEOCT achieves near-optimal performance on training and testing.
arXiv Detail & Related papers (2023-11-12T20:34:00Z)
An efficient hybrid classification approach for COVID-19 based on Harris Hawks Optimization and Salp Swarm Optimization [0.0]
This study presents a hybrid binary version of the Harris Hawks Optimization algorithm (HHO) and Salp Swarm Optimization (SSA) for Covid-19 classification. The proposed algorithm (HHOSSA) achieved 96% accuracy with the SVM, 98% and 98% accuracy with two classifiers.
arXiv Detail & Related papers (2022-12-25T19:52:18Z)
Monte Carlo Tree Descent for Black-Box Optimization [10.698553177585973]
We study how to further integrate sample-based descent for faster optimization. We design novel ways of expanding Monte Carlo search trees, with new descent methods at vertices. We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems.
arXiv Detail & Related papers (2022-11-01T22:45:10Z)
An Empirical Evaluation of Zeroth-Order Optimization Methods on AI-driven Molecule Optimization [78.36413169647408]
We study the effectiveness of various ZO optimization methods for optimizing molecular objectives. We show the advantages of ZO sign-based gradient descent (ZO-signGD) We demonstrate the potential effectiveness of ZO optimization methods on widely used benchmark tasks from the Guacamol suite.
arXiv Detail & Related papers (2022-10-27T01:58:10Z)
Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient scalable and general framework that can directly search on the tasks of interest. Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree. We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent- form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z)
LassoBench: A High-Dimensional Hyperparameter Optimization Benchmark Suite for Lasso [84.6451154376526]
LassoBench is a new benchmark suite tailored for an important open research topic in the Lasso community. We evaluate 5 state-of-the-art HPO methods and 3 baselines, and demonstrate that Bayesian optimization, in particular, can improve over the methods commonly used for sparse regression.
arXiv Detail & Related papers (2021-11-04T12:05:09Z)
Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search Spaces [63.22864716473051]
We propose a novel BO algorithm which expands (and shifts) the search space over iterations. We show theoretically that for both our algorithms, the cumulative regret grows at sub-linear rates.
arXiv Detail & Related papers (2020-09-05T14:24:40Z)

This list is automatically generated from the titles and abstracts of the papers in this site.