Scattered Forest Search: Smarter Code Space Exploration with LLMs
- URL: http://arxiv.org/abs/2411.05010v1
- Date: Tue, 22 Oct 2024 01:58:29 GMT
- Title: Scattered Forest Search: Smarter Code Space Exploration with LLMs
- Authors: Jonathan Light, Yue Wu, Yiyou Sun, Wenchao Yu, Yanchi liu, Xujiang Zhao, Ziniu Hu, Haifeng Chen, Wei Cheng,
- Abstract summary: We introduce Scattered Forest Search to enhance solution diversity while searching for solutions.
Experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode reveal significant performance improvements.
- Score: 55.71665969800222
- License:
- Abstract: We propose a novel approach to scaling LLM inference for code generation. We frame code generation as a black box optimization problem within the code space, and employ optimization-inspired techniques to enhance exploration. Specifically, we introduce Scattered Forest Search to enhance solution diversity while searching for solutions. Our theoretical analysis illustrates how these methods avoid local optima during optimization. Extensive experiments on HumanEval, MBPP, APPS, CodeContests, and Leetcode reveal significant performance improvements. For instance, our method achieves a pass@1 rate of 67.1% on HumanEval+ and 87.2% on HumanEval with GPT-3.5, marking improvements of 8.6% and 4.3% over the state-of-the-art, while also halving the iterations needed to find the correct solution. Furthermore, our method scales more efficiently than existing search techniques, including tree search, line search, and repeated sampling.
Related papers
- RethinkMCTS: Refining Erroneous Thoughts in Monte Carlo Tree Search for Code Generation [65.5353313491402]
We introduce RethinkMCTS, which employs the Monte Carlo Tree Search (MCTS) algorithm to conduct thought-level searches before generating code.
We construct verbal feedback from fine-turbo code execution feedback to refine erroneous thoughts during the search.
We demonstrate that RethinkMCTS outperforms previous search-based and feedback-based code generation baselines.
arXiv Detail & Related papers (2024-09-15T02:07:28Z) - Search-Based LLMs for Code Optimization [16.843870288512363]
Code written by developers usually suffers from efficiency problems and contain various performance bugs.
Recent work regards the task as a sequence generation problem, and resorts to deep learning (DL) techniques such as large language models (LLMs)
We propose a search-based LLMs framework named SBLLM that enables iterative refinement and discovery of improved optimization methods.
arXiv Detail & Related papers (2024-08-22T06:59:46Z) - A Three-Stage Algorithm for the Closest String Problem on Artificial and Real Gene Sequences [39.58317527488534]
Closest String Problem is an NP-hard problem that aims to find a string that has the minimum distance from all sequences that belong to the given set of strings.
In this paper, we introduce a three-stage algorithm that comprises the following process: first, we apply a novel alphabet pruning method to reduce the search space for effectively finding promising search regions.
Second, a variant of beam search to find a solution is employed. This method utilizes a newly developed guiding function based on an expected distance score of partial solutions.
arXiv Detail & Related papers (2024-07-17T21:26:27Z) - Iterative or Innovative? A Problem-Oriented Perspective for Code Optimization [81.88668100203913]
Large language models (LLMs) have demonstrated strong capabilities in solving a wide range of programming tasks.
In this paper, we explore code optimization with a focus on performance enhancement, specifically aiming to optimize code for minimal execution time.
arXiv Detail & Related papers (2024-06-17T16:10:10Z) - Discovering Preference Optimization Algorithms with and for Large Language Models [50.843710797024805]
offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs.
We perform objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention.
Experiments demonstrate the state-of-the-art performance of DiscoPOP, a novel algorithm that adaptively blends logistic and exponential losses.
arXiv Detail & Related papers (2024-06-12T16:58:41Z) - A new approach for solving global optimization and engineering problems
based on modified Sea Horse Optimizer [1.84493167882938]
Sea Horse (SHO) is a metaheuristic algorithm that emulates various intelligent behaviors exhibited by sea horses.
To mimic the nuanced locomotion of sea horses, SHO integrates the logarithmic helical equation and Levy flight.
This study introduces a robust and high-performance variant of the SHO algorithm named mSHO.
arXiv Detail & Related papers (2024-02-21T11:28:00Z) - A GPU-Accelerated Moving-Horizon Algorithm for Training Deep
Classification Trees on Large Datasets [13.02279265869312]
We introduce a moving-horizon differential algorithm for classification trees with continuous features (MH-DEOCT)
Our approach consists of a discrete tree decoding method that eliminates duplicated searches between adjacent samples.
MH-DEOCT achieves near-optimal performance on training and testing.
arXiv Detail & Related papers (2023-11-12T20:34:00Z) - Monte Carlo Tree Descent for Black-Box Optimization [10.698553177585973]
We study how to further integrate sample-based descent for faster optimization.
We design novel ways of expanding Monte Carlo search trees, with new descent methods at vertices.
We show empirically that the proposed algorithms can outperform state-of-the-art methods on many challenging benchmark problems.
arXiv Detail & Related papers (2022-11-01T22:45:10Z) - Efficient Non-Parametric Optimizer Search for Diverse Tasks [93.64739408827604]
We present the first efficient scalable and general framework that can directly search on the tasks of interest.
Inspired by the innate tree structure of the underlying math expressions, we re-arrange the spaces into a super-tree.
We adopt an adaptation of the Monte Carlo method to tree search, equipped with rejection sampling and equivalent- form detection.
arXiv Detail & Related papers (2022-09-27T17:51:31Z) - Sub-linear Regret Bounds for Bayesian Optimisation in Unknown Search
Spaces [63.22864716473051]
We propose a novel BO algorithm which expands (and shifts) the search space over iterations.
We show theoretically that for both our algorithms, the cumulative regret grows at sub-linear rates.
arXiv Detail & Related papers (2020-09-05T14:24:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.