LLaMA Rider: Spurring Large Language Models to Explore the Open World
- URL: http://arxiv.org/abs/2310.08922v1
- Date: Fri, 13 Oct 2023 07:47:44 GMT
- Title: LLaMA Rider: Spurring Large Language Models to Explore the Open World
- Authors: Yicheng Feng, Yuxuan Wang, Jiazheng Liu, Sipeng Zheng, and Zongqing Lu
- Abstract summary: The capacity of Large Language Models to continuously acquire environmental knowledge and adapt in an open world remains uncertain.
We propose an approach to spur LLMs to explore the open world, gather experiences, and learn to improve their task-solving capabilities.
By evaluation in Minecraft, an open-ended sandbox world, we demonstrate that our approach LLaMA-Rider enhances the efficiency of the LLM in exploring the environment.
- Score: 36.261626047323695
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recently, various studies have leveraged Large Language Models (LLMs) to help
decision-making and planning in environments, and try to align the LLMs'
knowledge with the world conditions. Nonetheless, the capacity of LLMs to
continuously acquire environmental knowledge and adapt in an open world remains
uncertain. In this paper, we propose an approach to spur LLMs to explore the
open world, gather experiences, and learn to improve their task-solving
capabilities. In this approach, a multi-round feedback-revision mechanism is
utilized to encourage LLMs to actively select appropriate revision actions
guided by feedback information from the environment. This facilitates
exploration and enhances the model's performance. Besides, we integrate
sub-task relabeling to assist LLMs in maintaining consistency in sub-task
planning and help the model learn the combinatorial nature between tasks,
enabling it to complete a wider range of tasks through training based on the
acquired exploration experiences. By evaluation in Minecraft, an open-ended
sandbox world, we demonstrate that our approach LLaMA-Rider enhances the
efficiency of the LLM in exploring the environment, and effectively improves
the LLM's ability to accomplish more tasks through fine-tuning with merely 1.3k
instances of collected data, showing minimal training costs compared to the
baseline using reinforcement learning.
Related papers
- From Selection to Generation: A Survey of LLM-based Active Learning [153.8110509961261]
Large Language Models (LLMs) have been employed for generating entirely new data instances and providing more cost-effective annotations.
This survey aims to serve as an up-to-date resource for researchers and practitioners seeking to gain an intuitive understanding of LLM-based AL techniques.
arXiv Detail & Related papers (2025-02-17T12:58:17Z) - Satori: Reinforcement Learning with Chain-of-Action-Thought Enhances LLM Reasoning via Autoregressive Search [57.28671084993782]
Large language models (LLMs) have demonstrated remarkable reasoning capabilities across diverse domains.
Recent studies have shown that increasing test-time computation enhances LLMs' reasoning capabilities.
We propose a two-stage training paradigm: 1) a small-scale format tuning stage to internalize the COAT reasoning format and 2) a large-scale self-improvement stage leveraging reinforcement learning.
arXiv Detail & Related papers (2025-02-04T17:26:58Z) - EVOLvE: Evaluating and Optimizing LLMs For Exploration [76.66831821738927]
Large language models (LLMs) remain under-studied in scenarios requiring optimal decision-making under uncertainty.
We measure LLMs' (in)ability to make optimal decisions in bandits, a state-less reinforcement learning setting relevant to many applications.
Motivated by the existence of optimal exploration algorithms, we propose efficient ways to integrate this algorithmic knowledge into LLMs.
arXiv Detail & Related papers (2024-10-08T17:54:03Z) - Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models [33.504700578933424]
Low sample efficiency is an enduring challenge of reinforcement learning (RL)
We introduce a framework that harnesses large language models to extract background knowledge of an environment.
Our experiments show that these methods achieve significant sample efficiency improvements in a spectrum of downstream tasks.
arXiv Detail & Related papers (2024-07-04T14:33:47Z) - Can LLMs Solve longer Math Word Problems Better? [47.227621867242]
Math Word Problems (MWPs) play a vital role in assessing the capabilities of Large Language Models (LLMs)
The impact of longer contexts on mathematical reasoning remains under-explored.
This study pioneers the investigation of Context Length Generalizability (CoLeG)
arXiv Detail & Related papers (2024-05-23T17:13:50Z) - Reinforcement Learning Problem Solving with Large Language Models [0.0]
Large Language Models (LLMs) have an extensive amount of world knowledge, and this has enabled their application in various domains to improve the performance of Natural Language Processing (NLP) tasks.
This has also facilitated a more accessible paradigm of conversation-based interactions between humans and AI systems to solve intended problems.
We show the practicality of our approach through two detailed case studies for "Research Scientist" and "Legal Matter Intake"
arXiv Detail & Related papers (2024-04-29T12:16:08Z) - Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [56.75702900542643]
We introduce AlphaLLM for the self-improvements of Large Language Models.
It integrates Monte Carlo Tree Search (MCTS) with LLMs to establish a self-improving loop.
Our experimental results show that AlphaLLM significantly enhances the performance of LLMs without additional annotations.
arXiv Detail & Related papers (2024-04-18T15:21:34Z) - Knowledgeable Agents by Offline Reinforcement Learning from Large Language Model Rollouts [10.929547354171723]
This paper introduces Knowledgeable Agents from Language Model Rollouts (KALM)
It extracts knowledge from large language models (LLMs) in the form of imaginary rollouts that can be easily learned by the agent through offline reinforcement learning methods.
It achieves a success rate of 46% in executing tasks with unseen goals, substantially surpassing the 26% success rate achieved by baseline methods.
arXiv Detail & Related papers (2024-04-14T13:19:40Z) - Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [73.54562551341454]
Large Language Model (LLM) agents significantly extend the capabilities of standalone LLMs.
We propose a novel approach that decomposes the aforementioned capabilities into a planner, caller, and summarizer.
This modular framework facilitates individual updates and the potential use of smaller LLMs for building each capability.
arXiv Detail & Related papers (2024-01-14T16:17:07Z) - Supervised Knowledge Makes Large Language Models Better In-context Learners [94.89301696512776]
Large Language Models (LLMs) exhibit emerging in-context learning abilities through prompt engineering.
The challenge of improving the generalizability and factuality of LLMs in natural language understanding and question answering remains under-explored.
We propose a framework that enhances the reliability of LLMs as it: 1) generalizes out-of-distribution data, 2) elucidates how LLMs benefit from discriminative models, and 3) minimizes hallucinations in generative tasks.
arXiv Detail & Related papers (2023-12-26T07:24:46Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.