LLM Augmented Hierarchical Agents
- URL: http://arxiv.org/abs/2311.05596v1
- Date: Thu, 9 Nov 2023 18:54:28 GMT
- Title: LLM Augmented Hierarchical Agents
- Authors: Bharat Prakash, Tim Oates, Tinoosh Mohsenin
- Abstract summary: Solving long-horizon, temporally-extended tasks using Reinforcement Learning (RL) is challenging, compounded by the common practice of learning without prior knowledge (or tabula rasa learning).
In this paper we exploit the planning capabilities of LLMs while using RL to provide learning from the environment, resulting in a hierarchical agent that uses LLMs to solve long-horizon tasks.
This approach is evaluated in simulation environments such as MiniGrid, SkillHack, and Crafter, and on a real robot arm in block manipulation tasks.
- Score: 4.574041097539858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving long-horizon, temporally-extended tasks using Reinforcement Learning
(RL) is challenging, compounded by the common practice of learning without
prior knowledge (or tabula rasa learning). Humans can generate and execute
plans with temporally-extended actions and quickly learn to perform new tasks
because we almost never solve problems from scratch. We want autonomous agents
to have this same ability. Recently, LLMs have been shown to encode a
tremendous amount of knowledge about the world and to perform impressive
in-context learning and reasoning. However, using LLMs to solve real world
problems is hard because they are not grounded in the current task. In this
paper we exploit the planning capabilities of LLMs while using RL to provide
learning from the environment, resulting in a hierarchical agent that uses LLMs
to solve long-horizon tasks. Rather than relying entirely on the LLM, the agent
uses it to guide the high-level policy, making learning significantly more
sample efficient. This
approach is evaluated in simulation environments such as MiniGrid, SkillHack,
and Crafter, and on a real robot arm in block manipulation tasks. We show that
agents trained using our approach outperform other baseline methods and, once
trained, don't need access to LLMs during deployment.
Related papers
- WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents [55.64361927346957]
We propose a neurosymbolic approach to learn rules gradient-free through large language models (LLMs).
Our embodied LLM agent "WALL-E" is built upon model-predictive control (MPC).
On open-world challenges in Minecraft and ALFWorld, WALL-E achieves higher success rates than existing methods.
arXiv Detail & Related papers (2024-10-09T23:37:36Z)
- RePrompt: Planning by Automatic Prompt Engineering for Large Language Models Agents [27.807695570974644]
Large language models (LLMs) have had remarkable success in domains outside the traditional natural language processing.
We propose a novel method, RePrompt, which does "gradient descent" to optimize the step-by-step instructions in the prompts of LLM agents.
arXiv Detail & Related papers (2024-06-17T01:23:11Z)
- Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly decreases the interaction steps of agents.
arXiv Detail & Related papers (2024-05-23T08:33:19Z)
- ST-LLM: Large Language Models Are Effective Temporal Learners [58.79456373423189]
Large Language Models (LLMs) have showcased impressive capabilities in text comprehension and generation.
How to effectively encode and understand videos in video-based dialogue systems remains to be solved.
We propose ST-LLM, an effective video-LLM baseline with spatial-temporal sequence modeling inside LLM.
arXiv Detail & Related papers (2024-03-30T10:11:26Z)
- EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents [65.38474102119181]
We propose EnvGen, a framework to adaptively create training environments.
We train a small RL agent in a mixture of the original and LLM-generated environments.
We find that a small RL agent trained with EnvGen can outperform SOTA methods, including a GPT-4 agent, and learns long-horizon tasks significantly faster.
arXiv Detail & Related papers (2024-03-18T17:51:16Z)
- ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [80.10358123795946]
We develop a framework for building multi-turn RL algorithms for fine-tuning large language models.
Our framework adopts a hierarchical RL approach and runs two RL algorithms in parallel.
Empirically, we find that ArCHer significantly improves efficiency and performance on agent tasks.
arXiv Detail & Related papers (2024-02-29T18:45:56Z)
- Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model with far fewer parameters and take its answers as heuristic answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
arXiv Detail & Related papers (2024-02-19T11:11:08Z)
- LgTS: Dynamic Task Sampling using LLM-generated sub-goals for Reinforcement Learning Agents [10.936460061405157]
We propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs.
Our approach does not assume access to a proprietary or fine-tuned LLM, nor does it require pre-trained policies that achieve the sub-goals proposed by the LLM.
arXiv Detail & Related papers (2023-10-14T00:07:03Z)
- LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs [25.4184470735779]
LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities.
We explore whether LLMs can replicate more complex crowdsourcing pipelines.
arXiv Detail & Related papers (2023-07-19T17:54:43Z)
- Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach [31.6589518077397]
Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets.
LLMs can assist an embodied agent in solving complex sequential decision making tasks by providing high-level instructions.
We propose When2Ask, a reinforcement learning based approach that learns when it is necessary to query LLMs for high-level instructions.
arXiv Detail & Related papers (2023-06-06T11:49:09Z)
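As a rough illustration of the When2Ask idea above, a learned mediator can decide at each state whether querying the LLM is worth its cost. The sketch below is an assumption, not the paper's method: the `QUERY_COST` penalty, the bandit-style update, and the `AskingPolicy` class are hypothetical stand-ins for the paper's RL formulation.

```python
import random

# Hedged sketch of a When2Ask-style asking policy (illustrative only).
# A mediator decides whether to "ask" the LLM for a new high-level
# instruction or "continue" with the cached one; a per-query penalty
# (QUERY_COST, assumed here) teaches it to ask sparingly.

QUERY_COST = 0.1  # hypothetical penalty per LLM call

class AskingPolicy:
    def __init__(self, epsilon=0.1, lr=0.05):
        self.values = {}   # state -> {"ask": value, "continue": value}
        self.epsilon = epsilon
        self.lr = lr

    def act(self, state):
        v = self.values.setdefault(state, {"ask": 0.0, "continue": 0.0})
        if random.random() < self.epsilon:   # explore occasionally
            return random.choice(["ask", "continue"])
        return max(v, key=v.get)             # exploit learned values

    def update(self, state, action, reward):
        # Bandit-style update; the paper's RL formulation would also
        # bootstrap from the next state's value.
        v = self.values.setdefault(state, {"ask": 0.0, "continue": 0.0})
        penalty = QUERY_COST if action == "ask" else 0.0
        v[action] += self.lr * ((reward - penalty) - v[action])
```

Penalizing each query pushes the mediator toward asking only when a fresh instruction is expected to improve the return.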
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.