Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution
- URL: http://arxiv.org/abs/2412.14212v1
- Date: Wed, 18 Dec 2024 08:47:17 GMT
- Title: Tree-of-Code: A Hybrid Approach for Robust Complex Task Planning and Execution
- Authors: Ziyi Ni, Yifan Li, Daxiang Dong,
- Abstract summary: We propose a novel approach called Tree-of-Code (ToC) to tackle the challenges of complex problem planning and execution with an end-to-end mechanism.
In our framework, each final code execution result is treated as a node in the decision tree, with a breadth-first search strategy employed to explore potential solutions.
- Score: 3.229241113813517
- Abstract: The exceptional capabilities of large language models (LLMs) have substantially accelerated the rapid rise and widespread adoption of agents. Recent studies have demonstrated that generating Python code to consolidate LLM-based agents' actions into a unified action space (CodeAct) is a promising approach for developing real-world LLM agents. However, this step-by-step code generation approach often lacks consistency and robustness, leading to instability in agent applications, particularly for complex reasoning and out-of-domain tasks. In this paper, we propose a novel approach called Tree-of-Code (ToC) to tackle the challenges of complex problem planning and execution with an end-to-end mechanism. By integrating key ideas from both Tree-of-Thought and CodeAct, ToC combines their strengths to enhance solution exploration. In our framework, each final code execution result is treated as a node in the decision tree, with a breadth-first search strategy employed to explore potential solutions. The final outcome is determined through a voting mechanism based on the outputs of the nodes.
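A minimal sketch of the loop the abstract describes, assuming hypothetical `generate_code` and `run_code` stand-ins for the LLM sampler and the sandboxed executor (neither interface is specified here): each executed program's output becomes a node, children are expanded breadth-first, and the final answer is chosen by majority vote over node outputs.

```python
from collections import Counter, deque
from dataclasses import dataclass

# Hypothetical stand-ins; the paper does not fix these interfaces.
def generate_code(task: str, feedback: str = "") -> str:
    raise NotImplementedError("sample a complete Python program from an LLM")

def run_code(code: str) -> str:
    raise NotImplementedError("execute the program in a sandbox, return its output")

@dataclass
class Node:
    result: str   # each final code-execution result is a node in the tree
    depth: int

def tree_of_code(task: str, branching: int = 3, max_depth: int = 2) -> str:
    frontier = deque(
        Node(run_code(generate_code(task)), depth=0) for _ in range(branching)
    )
    leaves = []
    while frontier:                      # breadth-first exploration
        node = frontier.popleft()
        if node.depth == max_depth:
            leaves.append(node)
            continue
        for _ in range(branching):       # expand with the parent's result as feedback
            code = generate_code(task, feedback=node.result)
            frontier.append(Node(run_code(code), depth=node.depth + 1))
    # the final outcome is decided by voting over node outputs
    return Counter(n.result for n in leaves).most_common(1)[0][0]
```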
Related papers
- Tree-of-Code: A Tree-Structured Exploring Framework for End-to-End Code Generation and Execution in Complex Task Handling [4.597983734278579]
Tree-of-Code boosts accuracy by nearly 20% over CodeAct while using fewer than a quarter of the turns.
Several LLMs even perform better on one-turn CodeProgram than on multi-turn CodeAct.
arXiv Detail & Related papers (2024-12-19T12:31:22Z)
- Enhancing LLM Reasoning with Reward-guided Tree Search [95.06503095273395]
Developing an o1-like reasoning approach is challenging, and researchers have been making various attempts to advance this open area of research.
We present a preliminary exploration into enhancing the reasoning abilities of LLMs through reward-guided tree search algorithms.
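As a rough illustration only (the paper's actual algorithm is not described in this summary), reward-guided tree search keeps a frontier of partial reasoning traces and always expands the one the reward model scores highest; `propose_steps` and `score` are hypothetical placeholders for the policy LLM and the reward model.

```python
import heapq

def propose_steps(trace: str) -> list[str]:
    raise NotImplementedError("LLM proposes candidate next reasoning steps")

def score(trace: str) -> float:
    raise NotImplementedError("reward model scores a partial reasoning trace")

def reward_guided_search(question: str, budget: int = 64) -> str:
    frontier = [(-score(question), question)]     # max-heap via negated scores
    best = (float("-inf"), question)
    for _ in range(budget):
        if not frontier:
            break
        neg, trace = heapq.heappop(frontier)
        best = max(best, (-neg, trace))           # keep the highest-scoring trace seen
        for step in propose_steps(trace):
            child = f"{trace}\n{step}"
            heapq.heappush(frontier, (-score(child), child))
    return best[1]
```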
arXiv Detail & Related papers (2024-11-18T16:15:17Z)
- CodeTree: Agent-guided Tree Search for Code Generation with Large Language Models [106.11371409170818]
Large language models (LLMs) can act as agents with capabilities to self-refine and improve generated code autonomously.
We propose CodeTree, a framework for LLM agents to efficiently explore the search space in different stages of the code generation process.
Specifically, we adopted a unified tree structure to explicitly explore different coding strategies, generate corresponding coding solutions, and subsequently refine the solutions.
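A simplified sketch of that kind of staged tree (strategy, then solution, then refinement); it is not CodeTree's implementation, and `llm` plus the test harness are hypothetical placeholders.

```python
from dataclasses import dataclass, field

def llm(prompt: str) -> str:
    raise NotImplementedError("call the underlying language model")

def passes_tests(code: str) -> bool:
    raise NotImplementedError("run the candidate against unit tests")

@dataclass
class TreeNode:
    content: str                      # a strategy, a solution, or a refinement
    children: list["TreeNode"] = field(default_factory=list)

def code_tree(problem: str, n_strategies: int = 3, n_refinements: int = 2) -> str | None:
    root = TreeNode(problem)
    for _ in range(n_strategies):                         # stage 1: coding strategies
        strategy = TreeNode(llm(f"Propose a strategy for: {problem}"))
        root.children.append(strategy)
        solution = TreeNode(llm(f"Write code following: {strategy.content}"))
        strategy.children.append(solution)                # stage 2: candidate solution
        candidate = solution.content
        for _ in range(n_refinements):                    # stage 3: refine until tests pass
            if passes_tests(candidate):
                return candidate
            candidate = llm(f"Fix this code:\n{candidate}")
            solution.children.append(TreeNode(candidate))
    return None
```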
arXiv Detail & Related papers (2024-11-07T00:09:54Z)
- DynaSaur: Large Language Agents Beyond Predefined Actions [108.75187263724838]
Existing LLM agent systems typically select actions from a fixed and predefined set at every step.
We propose an LLM agent framework that enables the dynamic creation and composition of actions in an online manner.
Our experiments on the GAIA benchmark demonstrate that this framework offers significantly greater flexibility and outperforms previous methods.
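The key idea, creating new actions as code online rather than choosing from a fixed set, can be sketched roughly as follows; the agent keeps a growing registry of Python functions it has generated, and `llm_write_action` is a hypothetical placeholder.

```python
from typing import Callable

def llm_write_action(task: str, available: list[str]) -> str:
    raise NotImplementedError("LLM writes a new Python function as source code")

class DynamicActionAgent:
    """Toy sketch: actions are generated, registered, and reused online."""

    def __init__(self):
        self.actions: dict[str, Callable] = {}    # grows as the agent invents actions

    def acquire_action(self, task: str) -> str:
        source = llm_write_action(task, list(self.actions))
        namespace: dict = {}
        exec(source, namespace)                   # define the new function
        name, fn = next(
            (k, v) for k, v in namespace.items()
            if callable(v) and not k.startswith("__")
        )
        self.actions[name] = fn                   # available for later composition
        return name

    def act(self, task: str, *args):
        name = self.acquire_action(task)
        return self.actions[name](*args)
```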
arXiv Detail & Related papers (2024-11-04T02:08:59Z)
- ConceptAgent: LLM-Driven Precondition Grounding and Tree Search for Robust Task Planning and Execution [33.252158560173655]
ConceptAgent is a natural language-driven robotic platform designed for task execution in unstructured environments.
We present innovations designed to limit shortcomings, including 1) Predicate Grounding to prevent and recover from infeasible actions, and 2) an embodied version of LLM-guided Monte Carlo Tree Search with self-reflection.
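A tiny illustration of the precondition-grounding idea (the tree-search part is omitted): before executing an action, its preconditions are checked against the current state so infeasible actions are caught up front. The predicates and action names below are made up for the example.

```python
# Illustrative only: the state is a set of true predicates; each action
# declares the predicates that must hold before it can run.
PRECONDITIONS = {
    "pick_up(cup)": {"reachable(cup)", "hand_empty"},
    "place(cup, table)": {"holding(cup)", "near(table)"},
}

def grounded_execute(action: str, state: set[str]) -> str:
    missing = PRECONDITIONS.get(action, set()) - state
    if missing:
        # Infeasible: report why, so the planner can recover instead of failing silently.
        return f"cannot {action}: unsatisfied preconditions {sorted(missing)}"
    return f"executed {action}"

state = {"reachable(cup)", "hand_empty"}
print(grounded_execute("pick_up(cup)", state))       # executed pick_up(cup)
print(grounded_execute("place(cup, table)", state))  # reports the missing preconditions
```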
arXiv Detail & Related papers (2024-10-08T15:05:40Z)
- Unlocking Reasoning Potential in Large Language Models by Scaling Code-form Planning [94.76546523689113]
We introduce CodePlan, a framework that generates and follows code-form plans -- pseudocode that outlines high-level, structured reasoning processes.
CodePlan effectively captures the rich semantics and control flows inherent to sophisticated reasoning tasks.
It achieves a 25.1% relative improvement compared with directly generating responses.
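Conceptually, a code-form plan is pseudocode the model first writes and then follows; a rough sketch of that two-stage prompting pattern is below, with `llm` as a hypothetical placeholder rather than the paper's actual prompts.

```python
def llm(prompt: str) -> str:
    raise NotImplementedError("call the underlying language model")

def answer_with_code_form_plan(question: str) -> str:
    # Stage 1: draft a high-level plan as pseudocode, which captures control
    # flow (loops, branches, intermediate variables) rather than free-form prose.
    plan = llm(
        "Write pseudocode that outlines, step by step, how to solve:\n"
        f"{question}\nUse variables, loops and conditionals where helpful."
    )
    # Stage 2: follow the plan to produce the final answer.
    return llm(
        f"Question: {question}\n"
        f"Plan (pseudocode):\n{plan}\n"
        "Execute the plan step by step and give the final answer."
    )
```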
arXiv Detail & Related papers (2024-09-19T04:13:58Z)
- Textualized Agent-Style Reasoning for Complex Tasks by Multiple Round LLM Generation [49.27250832754313]
We present AgentCOT, an LLM-based autonomous agent framework.
At each step, AgentCOT selects an action and executes it to yield an intermediate result with supporting evidence.
We introduce two new strategies to enhance the performance of AgentCOT.
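The step structure described here (choose an action, execute it, record the intermediate result together with its supporting evidence) resembles the generic loop below; this is not the paper's code, and `choose_action` and `execute` are hypothetical placeholders.

```python
from dataclasses import dataclass

def choose_action(question: str, history: list) -> str:
    raise NotImplementedError("LLM selects the next action given the trajectory so far")

def execute(action: str) -> tuple[str, str]:
    raise NotImplementedError("run the action; return (intermediate result, evidence)")

@dataclass
class Step:
    action: str
    result: str
    evidence: str

def agent_reasoning_loop(question: str, max_steps: int = 8) -> list[Step]:
    history: list[Step] = []
    for _ in range(max_steps):
        action = choose_action(question, history)
        if action == "finish":
            break
        result, evidence = execute(action)
        history.append(Step(action, result, evidence))   # evidence is kept for later steps
    return history
```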
arXiv Detail & Related papers (2024-09-19T02:20:06Z)
- Can Github issues be solved with Tree Of Thoughts? [0.0]
This research introduces the application of the Tree of Thoughts (ToT) language model reasoning framework for enhancing the decision-making and problem-solving abilities of LLMs.
We experimentally deploy ToT in tackling a Github issue contained within an instance of the SWE-bench.
arXiv Detail & Related papers (2024-05-20T11:05:56Z)
- Agents meet OKR: An Object and Key Results Driven Agent System with Hierarchical Self-Collaboration and Self-Evaluation [25.308341461293857]
OKR-Agent is designed to enhance the capabilities of Large Language Models (LLMs) in task-solving.
Our framework includes two novel modules: hierarchical Objects and Key Results generation and multi-level evaluation.
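Read literally, the two modules suggest a recursive decomposition into objectives and key results followed by evaluation at each level; the schematic sketch below follows that reading (it is not the paper's implementation), with the LLM-backed helpers left as hypothetical placeholders.

```python
from dataclasses import dataclass, field

def propose_key_results(objective: str) -> list[str]:
    raise NotImplementedError("LLM decomposes an objective into key results")

def solve(key_result: str) -> str:
    raise NotImplementedError("LLM (or sub-agent) produces work for one key result")

def evaluate(objective: str, outputs: list[str]) -> float:
    raise NotImplementedError("LLM scores how well the outputs satisfy the objective")

@dataclass
class OKRNode:
    objective: str
    outputs: list[str] = field(default_factory=list)
    score: float = 0.0

def run_okr(objective: str, depth: int = 2) -> OKRNode:
    node = OKRNode(objective)
    for kr in propose_key_results(objective):
        # Recurse on sub-objectives until the hierarchy bottoms out, then solve directly.
        result = run_okr(kr, depth - 1).outputs if depth > 1 else [solve(kr)]
        node.outputs.extend(result)
    node.score = evaluate(objective, node.outputs)        # multi-level evaluation
    return node
```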
arXiv Detail & Related papers (2023-11-28T06:16:30Z)
- SwiftSage: A Generative Agent with Fast and Slow Thinking for Complex Interactive Tasks [81.9962823875981]
We introduce SwiftSage, a novel agent framework inspired by the dual-process theory of human cognition.
The framework comprises two primary modules: the Swift module, representing fast and intuitive thinking, and the Sage module, emulating deliberate thought processes.
In 30 tasks from the ScienceWorld benchmark, SwiftSage significantly outperforms other methods such as SayCan, ReAct, and Reflexion.
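The dual-process idea maps naturally onto a dispatcher that uses a small, fast policy by default and falls back to a deliberate LLM planner when confidence is low; the sketch below reflects that general pattern rather than SwiftSage's actual switching heuristics, and both modules are hypothetical placeholders.

```python
def swift_policy(observation: str) -> tuple[str, float]:
    raise NotImplementedError("small, fast model: returns (action, confidence)")

def sage_plan(observation: str, history: list[str]) -> list[str]:
    raise NotImplementedError("LLM: deliberate multi-step plan for the current situation")

def dual_process_agent(observation: str, history: list[str],
                       threshold: float = 0.8) -> list[str]:
    # Fast path: intuitive action from the Swift module when it is confident.
    action, confidence = swift_policy(observation)
    if confidence >= threshold:
        return [action]
    # Slow path: hand control to the Sage module for deliberate planning.
    return sage_plan(observation, history)
```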
arXiv Detail & Related papers (2023-05-27T07:04:15Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.