LLM Augmented Hierarchical Agents
- URL: http://arxiv.org/abs/2311.05596v1
- Date: Thu, 9 Nov 2023 18:54:28 GMT
- Title: LLM Augmented Hierarchical Agents
- Authors: Bharat Prakash, Tim Oates, Tinoosh Mohsenin
- Abstract summary: Solving long-horizon, temporally-extended tasks using Reinforcement Learning (RL) is challenging, compounded by the common practice of learning without prior knowledge (or tabula rasa learning).
In this paper we exploit the planning capabilities of LLMs while using RL to provide learning from the environment, resulting in a hierarchical agent that uses LLMs to solve long-horizon tasks.
This approach is evaluated in simulation environments such as MiniGrid, SkillHack, and Crafter, and on a real robot arm in block manipulation tasks.
- Score: 4.574041097539858
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Solving long-horizon, temporally-extended tasks using Reinforcement Learning
(RL) is challenging, compounded by the common practice of learning without
prior knowledge (or tabula rasa learning). Humans can generate and execute
plans with temporally-extended actions and quickly learn to perform new tasks
because we almost never solve problems from scratch. We want autonomous agents
to have this same ability. Recently, LLMs have been shown to encode a
tremendous amount of knowledge about the world and to perform impressive
in-context learning and reasoning. However, using LLMs to solve real world
problems is hard because they are not grounded in the current task. In this
paper we exploit the planning capabilities of LLMs while using RL to provide
learning from the environment, resulting in a hierarchical agent that uses LLMs
to solve long-horizon tasks. Instead of completely relying on LLMs, they guide
a high-level policy, making learning significantly more sample efficient. This
approach is evaluated in simulation environments such as MiniGrid, SkillHack,
and Crafter, and on a real robot arm in block manipulation tasks. We show that
agents trained using our approach outperform other baseline methods and, once
trained, don't need access to LLMs during deployment.
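To make the division of labor concrete, here is a minimal sketch of the training-time arrangement the abstract describes: an LLM biases a high-level policy's choice among temporally-extended skills during training, and is dropped at deployment. All names here (llm_suggest_option, HighLevelPolicy) are hypothetical stand-ins, not the paper's implementation.
```python
OPTIONS = ["pick_up_key", "open_door", "go_to_goal"]  # temporally-extended skills

def llm_suggest_option(observation_text):
    """Stand-in for an LLM call mapping a text observation to a skill."""
    return "pick_up_key"  # a real system would prompt an LLM here

class HighLevelPolicy:
    def __init__(self):
        self.preferences = {o: 0.0 for o in OPTIONS}  # learned option values

    def select(self, observation_text, use_llm_guidance, bonus=1.0):
        scores = dict(self.preferences)
        if use_llm_guidance:  # LLM shapes exploration during training only
            scores[llm_suggest_option(observation_text)] += bonus
        return max(scores, key=scores.get)

    def update(self, option, reward, lr=0.1):
        self.preferences[option] += lr * (reward - self.preferences[option])

policy = HighLevelPolicy()
for episode in range(100):  # training: LLM in the loop
    option = policy.select("key visible, door locked", use_llm_guidance=True)
    reward = 1.0 if option == "pick_up_key" else 0.0  # toy environment signal
    policy.update(option, reward)

# deployment: the trained policy acts without querying the LLM
print(policy.select("key visible, door locked", use_llm_guidance=False))
```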
Related papers
- Scaling Autonomous Agents via Automatic Reward Modeling And Planning [52.39395405893965]
Large language models (LLMs) have demonstrated remarkable capabilities across a range of tasks.
However, they still struggle with problems requiring multi-step decision-making and environmental feedback.
We propose a framework that can automatically learn a reward model from the environment without human annotations.
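A hedged sketch of the general idea: a reward model is fit to preference pairs that an automated judge (stubbed here; the paper uses its own framework) produces without human annotations. The Bradley-Terry objective and all names below are illustrative assumptions.
```python
import math, random

def llm_prefers(traj_a, traj_b):
    """Stand-in for an LLM judging which trajectory better fits the task."""
    return sum(traj_a) > sum(traj_b)  # toy proxy for an LLM preference call

def score(w, traj):
    return sum(wi * fi for wi, fi in zip(w, traj))  # linear reward model

w = [0.0, 0.0, 0.0]
pairs = [([random.random() for _ in range(3)],
          [random.random() for _ in range(3)]) for _ in range(200)]

for a, b in pairs:
    if not llm_prefers(a, b):
        a, b = b, a                      # ensure a is the preferred trajectory
    p = 1 / (1 + math.exp(score(w, b) - score(w, a)))  # Bradley-Terry prob
    g = 1 - p                            # gradient of -log p w.r.t. the margin
    w = [wi + 0.1 * g * (fa - fb) for wi, fa, fb in zip(w, a, b)]

print(w)  # weights drift positive, matching the judge's heuristic
```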
arXiv Detail & Related papers (2025-02-17T18:49:25Z) - Should You Use Your Large Language Model to Explore or Exploit? [55.562545113247666]
We evaluate the ability of large language models to help a decision-making agent facing an exploration-exploitation tradeoff.
We find that while current LLMs often struggle to exploit, in-context mitigations can substantially improve performance on small-scale tasks.
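For flavor, a toy version of the evaluation setup: an LLM-like chooser faces a multi-armed bandit, with the in-context mitigation of summarizing per-arm statistics in the prompt. The greedy stub below is not a real LLM call; it just reproduces the under-exploration failure mode the summary mentions.
```python
import random

TRUE_MEANS = [0.3, 0.5, 0.7]

def summarize(counts, totals):
    """Per-arm (count, empirical mean) summary, as a prompt mitigation would."""
    return [(c, totals[i] / c if c else 0.0) for i, c in enumerate(counts)]

def llm_choose(summary):
    """Stand-in for an LLM: try unseen arms once, then exploit greedily."""
    for arm, (count, _) in enumerate(summary):
        if count == 0:
            return arm
    return max(range(len(summary)), key=lambda a: summary[a][1])

counts, totals = [0, 0, 0], [0.0, 0.0, 0.0]
for t in range(500):
    arm = llm_choose(summarize(counts, totals))
    reward = 1.0 if random.random() < TRUE_MEANS[arm] else 0.0
    counts[arm] += 1
    totals[arm] += reward

# pulls concentrate on whichever arm looked best early: the pure-greedy
# chooser illustrates the exploration failure the paper studies
print(counts)
```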
arXiv Detail & Related papers (2025-01-31T23:42:53Z) - WALL-E: World Alignment by Rule Learning Improves World Model-based LLM Agents [55.64361927346957]
We propose a neurosymbolic approach to learn rules gradient-free through large language models (LLMs).
Our embodied LLM agent "WALL-E" is built upon model-predictive control (MPC).
On open-world challenges in Minecraft and ALFWorld, WALL-E achieves higher success rates than existing methods.
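A rough sketch of the combination, under assumptions: an MPC loop whose world model is an LLM-style predictor (stubbed) corrected by learned symbolic rules. All names are hypothetical illustrations of the described approach, not WALL-E's code.
```python
from itertools import product

def llm_predict(state, action):
    """Stand-in LLM world model; over-optimistic about 'mine'."""
    if action == "craft_pickaxe":
        return state | {"has_pickaxe"}
    if action == "mine":
        return state | {"has_ore"}        # ignores the pickaxe precondition
    return state

# Gradient-free learned rules that override systematic LLM prediction errors.
RULES = [
    (lambda s, a: a == "mine" and "has_pickaxe" not in s,
     lambda pred: pred - {"has_ore"}),    # mining without a pickaxe fails
]

def world_model(state, action):
    pred = llm_predict(state, action)
    for condition, fix in RULES:
        if condition(state, action):
            pred = fix(pred)              # rule-aligned correction
    return pred

def mpc_plan(state, goal, actions, horizon=2):
    """Model-predictive control: take the first action of the best rollout."""
    best_first, best_score = None, -1
    for seq in product(actions, repeat=horizon):
        s = state
        for a in seq:
            s = world_model(s, a)
        score = len(goal & s)             # goal predicates satisfied at the end
        if score > best_score:
            best_first, best_score = seq[0], score
    return best_first

print(mpc_plan(frozenset(), {"has_ore"}, ["mine", "craft_pickaxe"]))
# -> 'craft_pickaxe': the rule stops MPC from trusting the bad LLM prediction
```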
arXiv Detail & Related papers (2024-10-09T23:37:36Z) - StateAct: State Tracking and Reasoning for Acting and Planning with Large Language Models [10.359008237358603]
Planning and acting to solve 'real' tasks using large language models (LLMs) in interactive environments has become a new frontier for AI methods.
We propose a simple method based on few-shot in-context learning alone to enhance 'chain-of-thought' with state-tracking.
We show that our method establishes the new state-of-the-art on Alfworld for in-context learning methods.
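A minimal sketch of what chain-of-thought plus state tracking can look like in a prompt, in the spirit of the summary above; the fields and formatting are illustrative assumptions, not StateAct's exact template.
```python
FEW_SHOT = """Goal: put the apple on the table.
State: location=kitchen, holding=nothing
Thought: I should find the apple first.
Action: go to counter
State: location=counter, holding=nothing
Thought: The apple is here; I should pick it up.
Action: take apple"""

def build_prompt(goal, tracked_state, history):
    """Restate the tracked state before every new thought/action step."""
    lines = [FEW_SHOT, "", f"Goal: {goal}", *history]
    lines.append(f"State: {tracked_state}")  # the state-tracking addition
    lines.append("Thought:")                 # the chain-of-thought slot
    return "\n".join(lines)

print(build_prompt("put the apple on the table",
                   "location=counter, holding=apple",
                   ["Thought: I already picked up the apple.",
                    "Action: go to table"]))
```
Re-stating the state at each step keeps the model grounded over long horizons without any fine-tuning, which is why this works as an in-context-only method.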
arXiv Detail & Related papers (2024-09-21T05:54:35Z) - Towards Efficient LLM Grounding for Embodied Multi-Agent Collaboration [70.09561665520043]
We propose a novel framework for multi-agent collaboration that introduces Reinforced Advantage feedback (ReAd) for efficient self-refinement of plans.
We provide theoretical analysis by extending advantage-weighted regression in reinforcement learning to multi-agent systems.
Experiments on Overcooked-AI and a difficult variant of RoCoBench show that ReAd surpasses baselines in success rate and significantly decreases the interaction steps of agents.
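A hedged sketch of advantage-based plan refinement: a learned critic scores each agent's proposed action, and plans with low-advantage actions are sent back to the LLM planner for revision. The exp(A/beta) weight mirrors advantage-weighted regression; the names, stubs, and thresholds are illustrative, not ReAd's actual code.
```python
import math

def critic_advantage(joint_plan):
    """Stand-in for a learned advantage estimator over joint actions."""
    return {agent: (0.8 if act == "cooperate" else -0.5)
            for agent, act in joint_plan.items()}

def llm_replan(joint_plan, weak_agents):
    """Stand-in for re-prompting the LLM with feedback on weak actions."""
    return {agent: ("cooperate" if agent in weak_agents else act)
            for agent, act in joint_plan.items()}

def refine(joint_plan, beta=1.0, threshold=0.0, max_rounds=3):
    for _ in range(max_rounds):
        adv = critic_advantage(joint_plan)
        weights = {a: math.exp(adv[a] / beta) for a in adv}  # AWR-style weights
        weak = [a for a in adv if adv[a] < threshold]
        if not weak:                      # every action clears the threshold
            return joint_plan, weights
        joint_plan = llm_replan(joint_plan, weak)  # self-refinement step
    return joint_plan, weights

plan = {"robot_1": "idle", "robot_2": "cooperate"}
print(refine(plan))  # robot_1's low-advantage action gets revised
```
Refining in closed loop with a critic, rather than re-rolling the environment, is what cuts the interaction steps the summary highlights.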
arXiv Detail & Related papers (2024-05-23T08:33:19Z) - EnvGen: Generating and Adapting Environments via LLMs for Training Embodied Agents [65.38474102119181]
We propose EnvGen, a framework to adaptively create training environments.
We train a small RL agent in a mixture of the original and LLM-generated environments.
We find that a small RL agent trained with EnvGen can outperform SOTA methods, including a GPT-4 agent, and learns long-horizon tasks significantly faster.
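A loose sketch of the training loop described above: the agent trains on a mixture of the original environment and LLM-generated variants targeted at weak skills. In the real framework the generator is re-prompted periodically with feedback; here it is called once, and all names (llm_generate_configs, the skill table) are hypothetical stand-ins.
```python
import random

def llm_generate_configs(weak_skills):
    """Stand-in for prompting an LLM to design envs that train weak skills."""
    return [{"focus": s} for s in weak_skills]

skills = {"chop_tree": 0.2, "mine_stone": 0.9}   # toy per-skill success rates
generated = llm_generate_configs([s for s, p in skills.items() if p < 0.5])

for step in range(1000):
    # sample from a 50/50 mixture of the original env and generated variants
    if generated and random.random() < 0.5:
        config = random.choice(generated)
    else:
        config = {"focus": None}                 # original environment
    if config["focus"]:                          # toy RL update: targeted
        skills[config["focus"]] = min(1.0, skills[config["focus"]] + 0.002)

print(skills)  # the weak skill ('chop_tree') improves via generated envs
```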
arXiv Detail & Related papers (2024-03-18T17:51:16Z) - Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs [60.40396361115776]
This paper introduces a novel collaborative approach, namely SlimPLM, that detects missing knowledge in large language models (LLMs) with a slim proxy model.
We employ a proxy model which has far fewer parameters, and take its answers as heuristic answers.
Heuristic answers are then utilized to predict the knowledge required to answer the user question, as well as the known and unknown knowledge within the LLM.
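A hedged sketch of the pipeline as summarized: a small model drafts a heuristic answer, the draft is split into claims, and retrieval is issued only for claims judged unknown. The stubs below replace the real models and are illustrative, not SlimPLM's code.
```python
def proxy_answer(question):
    """Small, cheap model drafts a heuristic answer."""
    return "The Eiffel Tower is in Paris. It was finished in 1889."

def split_claims(answer):
    return [c.strip() for c in answer.split(".") if c.strip()]

def llm_knows(claim):
    """Stand-in for judging whether the large model already knows a claim."""
    return "Paris" in claim              # pretend only the first fact is known

def retrieve(query):
    return f"[retrieved passage about: {query}]"

question = "Where is the Eiffel Tower and when was it completed?"
claims = split_claims(proxy_answer(question))
evidence = [retrieve(c) for c in claims if not llm_knows(c)]  # unknown only
prompt = question + "\n" + "\n".join(evidence)  # final LLM call gets evidence
print(prompt)
```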
arXiv Detail & Related papers (2024-02-19T11:11:08Z) - LgTS: Dynamic Task Sampling using LLM-generated sub-goals for
Reinforcement Learning Agents [10.936460061405157]
We propose LgTS (LLM-guided Teacher-Student learning), a novel approach that explores the planning abilities of LLMs.
Our approach does not assume access to a proprietary or fine-tuned LLM, nor does it require pre-trained policies that achieve the sub-goals proposed by the LLM.
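An illustrative sketch under assumptions: an LLM (stubbed) proposes candidate sub-goal sequences, and a teacher adaptively samples which sub-goal the RL student should practice, favoring unmastered ones. The names are hypothetical and the real LgTS algorithm is more involved.
```python
import random

def llm_propose_subgoal_paths(task):
    """Stand-in for querying an open LLM for candidate sub-goal orderings."""
    return [["get_key", "open_door", "reach_goal"],
            ["break_door", "reach_goal"]]

success = {}                              # student's success rate per sub-goal

def teacher_sample(paths):
    """Teacher: sample among sub-goals the student has not yet mastered."""
    candidates = [g for path in paths
                  for g in path if success.get(g, 0.0) < 0.9]
    return random.choice(candidates) if candidates else None

paths = llm_propose_subgoal_paths("escape the room")
for episode in range(2000):
    goal = teacher_sample(paths)
    if goal is None:
        break                             # all proposed sub-goals mastered
    # toy student update: practicing a sub-goal slowly raises its success rate
    success[goal] = min(1.0, success.get(goal, 0.0) + 0.01)

print(success)
```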
arXiv Detail & Related papers (2023-10-14T00:07:03Z) - LLMs as Workers in Human-Computational Algorithms? Replicating Crowdsourcing Pipelines with LLMs [33.1901850309037]
LLMs have shown promise in replicating human-like behavior in crowdsourcing tasks that were previously thought to be exclusive to human abilities.
We explore whether LLMs can replicate more complex crowdsourcing pipelines.
arXiv Detail & Related papers (2023-07-19T17:54:43Z) - Enabling Intelligent Interactions between an Agent and an LLM: A Reinforcement Learning Approach [31.6589518077397]
Large language models (LLMs) encode a vast amount of world knowledge acquired from massive text datasets.
LLMs can assist an embodied agent in solving complex sequential decision making tasks by providing high-level instructions.
We propose When2Ask, a reinforcement learning based approach that learns when it is necessary to query LLMs for high-level instructions.
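A minimal sketch of the idea: a mediator policy learns, via a simple value update, when querying the LLM planner is worth its cost versus continuing with the last instruction. The observation space, reward numbers, and llm-query cost below are hypothetical stand-ins, not When2Ask's setup.
```python
import random

q = {("new_room", "ask"): 0.0, ("new_room", "keep"): 0.0,
     ("same_room", "ask"): 0.0, ("same_room", "keep"): 0.0}

QUERY_COST = 0.2                          # every LLM call costs something
for step in range(5000):
    obs = random.choice(["new_room", "same_room"])
    # epsilon-greedy choice between querying the LLM and keeping the old plan
    if random.random() < 0.1:
        act = random.choice(["ask", "keep"])
    else:
        act = max(["ask", "keep"], key=lambda a: q[(obs, a)])
    # asking helps only when the situation changed; otherwise the old plan is fine
    task_reward = 1.0 if (obs == "new_room" and act == "ask") else \
                  (0.5 if obs == "same_room" else 0.0)
    reward = task_reward - (QUERY_COST if act == "ask" else 0.0)
    q[(obs, act)] += 0.1 * (reward - q[(obs, act)])

print({k: round(v, 2) for k, v in q.items()})
# learned policy: ask the LLM in new rooms, keep the old plan otherwise
```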
arXiv Detail & Related papers (2023-06-06T11:49:09Z)