AlphaZero-like Tree-Search can Guide Large Language Model Decoding and Training
- URL: http://arxiv.org/abs/2309.17179v2
- Date: Fri, 9 Feb 2024 00:13:46 GMT
- Title: AlphaZero-like Tree-Search can Guide Large Language Model Decoding and Training
- Authors: Xidong Feng, Ziyu Wan, Muning Wen, Stephen Marcus McAleer, Ying Wen, Weinan Zhang, Jun Wang
- Abstract summary: Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment the reasoning capabilities of LLMs.
We present an AlphaZero-like tree-search learning framework for LLMs (termed TS-LLM).
We show how tree-search with a learned value function can guide LLM decoding.
- Score: 37.79247073276239
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent works like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim
to augment the reasoning capabilities of LLMs by using tree-search algorithms
to guide multi-step reasoning. These methods rely on prompting a pre-trained
model to serve as a value function and focus on problems with low search depth.
As a result, these methods will not work in domains where the pre-trained LLM
does not have enough knowledge to serve as an effective value function or in
domains that require long-horizon planning. To address these limitations, we
present an AlphaZero-like tree-search learning framework for LLMs (termed
TS-LLM), systematically illustrating how tree-search with a learned value
function can guide LLM decoding. TS-LLM distinguishes itself in two key ways.
(1) Leveraging a learned value function and AlphaZero-like algorithms, our
approach adapts generally to a wide range of tasks, language models of any
size, and varying search depths. (2) Our approach can guide
LLMs during both inference and training, iteratively improving the LLM.
Empirical results across reasoning, planning, alignment, and decision-making
tasks show that TS-LLM outperforms existing approaches and can handle trees
with a depth of 64.
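To make the decoding-time mechanism concrete, below is a minimal Python sketch of AlphaZero-style Monte Carlo tree search over partial reasoning traces, where a learned value function replaces rollouts. This is a generic illustration, not the authors' TS-LLM code: `propose_steps` (the LLM proposal policy) and `value_fn` (the learned value model) are hypothetical stand-ins.

```python
# Minimal sketch of AlphaZero-style tree search guiding step-level LLM
# decoding. Not the authors' implementation: propose_steps (LLM policy)
# and value_fn (learned value model) are hypothetical stand-ins.
import math
import random

class Node:
    def __init__(self, state, parent=None):
        self.state = state          # partial reasoning trace (text)
        self.parent = parent
        self.children = []
        self.visits = 0
        self.value_sum = 0.0

    def ucb(self, c=1.4):
        # Upper-confidence score balancing exploitation and exploration.
        if self.visits == 0:
            return float("inf")
        exploit = self.value_sum / self.visits
        explore = c * math.sqrt(math.log(self.parent.visits) / self.visits)
        return exploit + explore

def propose_steps(state, k=3):
    # Stand-in for sampling k candidate next reasoning steps from the LLM.
    return [state + f" step{random.randint(0, 99)}" for _ in range(k)]

def value_fn(state):
    # Stand-in for a learned value head scoring a partial trace in [0, 1].
    return random.random()

def mcts_decode(root_state, simulations=50, max_depth=64):
    root = Node(root_state)
    for _ in range(simulations):
        node, depth = root, 0
        # Selection: descend by UCB until reaching a leaf.
        while node.children:
            node = max(node.children, key=Node.ucb)
            depth += 1
        # Expansion: grow the tree with LLM-proposed next steps.
        if depth < max_depth:
            node.children = [Node(s, node) for s in propose_steps(node.state)]
            node = random.choice(node.children)
        # Evaluation: the learned value replaces a rollout (AlphaZero-style).
        v = value_fn(node.state)
        # Backup: propagate the value estimate to the root.
        while node:
            node.visits += 1
            node.value_sum += v
            node = node.parent
    # Commit to the most-visited first step.
    best = max(root.children, key=lambda n: n.visits)
    return best.state

print(mcts_decode("Q: 2+2=?"))
```

In a TS-LLM-style training loop, trajectories found by such a search could also supply improved targets for further policy and value-network updates, which is the iterative-improvement use the abstract describes.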
Related papers
- zsLLMCode: An Effective Approach for Functional Code Embedding via LLM with Zero-Shot Learning [6.976968804436321]
Large language models (LLMs) are capable of zero-shot learning, which requires no training or fine-tuning.
We propose zsLLMCode, a novel approach that generates functional code embeddings using LLMs.
arXiv Detail & Related papers (2024-09-23T01:03:15Z)
- Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs [72.89652710634051]
Knowledge graphs (KGs) complement Large Language Models (LLMs) by providing reliable, structured, domain-specific, and up-to-date external knowledge.
We introduce Tree-of-Traversals, a novel zero-shot reasoning algorithm that enables augmentation of black-box LLMs with one or more KGs.
arXiv Detail & Related papers (2024-07-31T06:01:24Z)
- LiteSearch: Efficacious Tree Search for LLM [70.29796112457662]
This study introduces a novel guided tree search algorithm with dynamic node selection and a node-level exploration budget.
Experiments conducted on the GSM8K and TabMWP datasets demonstrate that our approach enjoys significantly lower computational costs compared to baseline methods.
arXiv Detail & Related papers (2024-06-29T05:14:04Z)
- Reasoning on Efficient Knowledge Paths: Knowledge Graph Guides Large Language Model for Domain Question Answering [18.94220625114711]
Large language models (LLMs) perform surprisingly well and outperform human experts on many tasks.
This paper integrates and optimizes a pipeline for selecting reasoning paths from a KG based on an LLM.
We also propose a simple and effective subgraph retrieval method based on chain of thought (CoT) and PageRank (see the retrieval sketch after this list).
arXiv Detail & Related papers (2024-04-16T08:28:16Z)
- RoT: Enhancing Large Language Models with Reflection on Search Trees [41.67536806038573]
We introduce Reflection on search Trees (RoT), an LLM reflection framework designed to improve the performance of tree-search-based prompting methods.
It uses a strong LLM to summarize guidelines from previous tree search experiences to enhance the ability of a weak LLM.
We propose a novel state selection method, which identifies the critical information from historical search processes to help RoT generate more specific and meaningful guidelines.
arXiv Detail & Related papers (2024-04-08T12:31:23Z)
- How Can LLM Guide RL? A Value-Based Approach [68.55316627400683]
Reinforcement learning (RL) has become the de facto standard for sequential decision-making problems, improving acting policies from feedback.
Recent developments in large language models (LLMs) have showcased impressive capabilities in language understanding and generation, yet they fall short in exploration and self-improvement capabilities.
We develop an algorithm named LINVIT that incorporates LLM guidance as a regularization factor in value-based RL, significantly reducing the amount of data needed for learning (a hedged sketch of such a regularized update follows this list).
arXiv Detail & Related papers (2024-02-25T20:07:13Z)
- Efficient Tool Use with Chain-of-Abstraction Reasoning [65.18096363216574]
Large language models (LLMs) need to ground their reasoning to real-world knowledge.
Challenges remain in fine-tuning LLM agents to invoke tools in multi-step reasoning problems.
We propose a new method for LLMs to better leverage tools in multi-step reasoning.
arXiv Detail & Related papers (2024-01-30T21:53:30Z)
- Autonomous Tree-search Ability of Large Language Models [58.68735916408101]
Large Language Models have demonstrated remarkable reasoning capabilities with advanced prompting techniques.
Recent works propose to utilize external programs to define search logic, such that LLMs can perform passive tree search to solve more challenging reasoning tasks.
We propose a new concept called autonomous tree-search ability of LLM, which can automatically generate a response containing search trajectories for the correct answer.
arXiv Detail & Related papers (2023-10-14T14:14:38Z)
- Tree-GPT: Modular Large Language Model Expert System for Forest Remote Sensing Image Understanding and Interactive Analysis [4.993840366641032]
This paper introduces a novel framework, Tree-GPT, which incorporates Large Language Models (LLMs) into the forestry remote sensing data workflow.
The prototype system performed well, demonstrating the potential for dynamic usage of LLMs in forestry research and environmental sciences.
arXiv Detail & Related papers (2023-10-07T06:12:39Z)
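The Knowledge Paths entry above mentions subgraph retrieval based on chain of thought and PageRank. Below is a hedged sketch of the generic technique, personalized PageRank over a toy knowledge graph seeded with question entities; the triples and seed entities are invented, and this is not the paper's pipeline.

```python
# Hedged sketch of PageRank-based subgraph retrieval over a knowledge
# graph. Toy triples and seeds are invented, not the paper's pipeline.
import networkx as nx

triples = [
    ("Paris", "capital_of", "France"),
    ("France", "member_of", "EU"),
    ("Paris", "located_on", "Seine"),
    ("Berlin", "capital_of", "Germany"),
    ("Germany", "member_of", "EU"),
]
G = nx.DiGraph()
for h, r, t in triples:
    G.add_edge(h, t, relation=r)

# Seed entities would come from CoT-style entity linking on the question.
seeds = {"Paris": 1.0}

# Personalized PageRank biases the random walk toward the seed entities.
scores = nx.pagerank(G, alpha=0.85, personalization=seeds)

# Keep the top-k nodes as the retrieved subgraph to hand to the LLM.
top = sorted(scores, key=scores.get, reverse=True)[:3]
subgraph = G.subgraph(top)
print(top, list(subgraph.edges(data="relation")))
```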
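The LINVIT entry above describes LLM guidance as a regularization factor in value-based RL. One standard way to realize that idea is a KL-regularized Bellman backup toward an LLM prior policy, shown below on a random tabular MDP. This is a hedged illustration of the general technique, not the paper's algorithm: `llm_prior` stands in for the action probabilities an LLM would assign in each state.

```python
# Hedged sketch of KL-regularized value iteration with an LLM prior,
# in the spirit of the LINVIT entry above (not the paper's algorithm).
import numpy as np

n_states, n_actions, gamma, lam = 4, 2, 0.9, 0.5
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # transitions
R = rng.random((n_states, n_actions))                             # rewards
llm_prior = rng.dirichlet(np.ones(n_actions), size=n_states)      # LLM policy

V = np.zeros(n_states)
for _ in range(200):
    Q = R + gamma * P @ V                      # (S, A) action values
    # Soft backup: a KL(pi || llm_prior) penalty has the closed form
    # V(s) = lam * log sum_a prior(a|s) * exp(Q(s,a) / lam).
    V = lam * np.log((llm_prior * np.exp(Q / lam)).sum(axis=1))

# The optimal regularized policy is proportional to prior * exp(Q / lam),
# so it stays close to the LLM prior, which is what cuts sample needs.
policy = llm_prior * np.exp(Q / lam)
policy /= policy.sum(axis=1, keepdims=True)
print(np.round(policy, 3))
```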