Hierarchical reinforcement learning with natural language subgoals
- URL: http://arxiv.org/abs/2309.11564v1
- Date: Wed, 20 Sep 2023 18:03:04 GMT
- Title: Hierarchical reinforcement learning with natural language subgoals
- Authors: Arun Ahuja, Kavya Kopparapu, Rob Fergus, Ishita Dasgupta
- Abstract summary: We use data from humans solving tasks to softly supervise the goal space for a set of long-range tasks in a 3D embodied environment, parameterizing this space with unconstrained natural language.
This has two advantages: first, it is easy to generate this data from naive human participants; second, it is flexible enough to represent a vast range of sub-goals in human-relevant tasks.
Our approach outperforms agents that clone expert behavior on these tasks, as well as HRL from scratch without this supervised sub-goal space.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Hierarchical reinforcement learning has been a compelling approach for
achieving goal-directed behavior over long sequences of actions. However, it
has been challenging to implement in realistic or open-ended environments. A
main challenge has been to find the right space of sub-goals over which to
instantiate a hierarchy. We present a novel approach where we use data from
humans solving these tasks to softly supervise the goal space for a set of
long-range tasks in a 3D embodied environment. In particular, we use unconstrained
natural language to parameterize this space. This has two advantages: first, it
is easy to generate this data from naive human participants; second, it is
flexible enough to represent a vast range of sub-goals in human-relevant tasks.
Our approach outperforms agents that clone expert behavior on these tasks, as
well as HRL from scratch without this supervised sub-goal space. Our work
presents a novel approach to combining human expert supervision with the
benefits and flexibility of reinforcement learning.
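The two-level decomposition described in the abstract can be sketched in code. This is a toy illustration only, not the paper's implementation: the paper's agents are learned neural policies trained on human data, whereas the hand-written rules, object names, and action strings below are hypothetical stand-ins.

```python
# Sketch of a two-level hierarchy: a high-level policy emits a
# natural-language sub-goal, and a low-level policy conditions on that
# text to choose a primitive action. All names here are illustrative.
from __future__ import annotations
from dataclasses import dataclass
from typing import Optional

@dataclass
class Observation:
    holding: Optional[str]   # object currently held, if any
    visible: list[str]       # objects the agent can currently see

def high_level_policy(obs: Observation, task: str) -> str:
    """Emit a free-form natural-language sub-goal for the current state."""
    if task == "put the book on the shelf":
        if obs.holding != "book":
            return "pick up the book"
        return "walk to the shelf and place the book"
    return "explore the room"

def low_level_policy(obs: Observation, subgoal: str) -> str:
    """Map (observation, language sub-goal) to a primitive action."""
    if subgoal.startswith("pick up"):
        target = subgoal.split()[-1]  # crude parse: take the last word
        return f"grasp({target})" if target in obs.visible else "look_around"
    if subgoal.startswith("walk to"):
        return "navigate(shelf)"
    return "look_around"

obs = Observation(holding=None, visible=["book", "table"])
subgoal = high_level_policy(obs, "put the book on the shelf")
action = low_level_policy(obs, subgoal)
```

The key property this sketch captures is that the high-level output space is unconstrained text, which is what allows annotations from naive human participants to define the sub-goal space.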
Related papers
- MENTOR: Guiding Hierarchical Reinforcement Learning with Human Feedback and Dynamic Distance Constraint
Hierarchical reinforcement learning (HRL) uses a hierarchical framework that divides tasks into subgoals and completes them sequentially.
Current methods struggle to find suitable subgoals for ensuring a stable learning process.
We propose a general hierarchical reinforcement learning framework incorporating human feedback and dynamic distance constraints.
arXiv Detail & Related papers (2024-02-22T03:11:09Z)
- Scaling Goal-based Exploration via Pruning Proto-goals
One of the gnarliest challenges in reinforcement learning is exploration that scales to vast domains.
Goal-directed, purposeful behaviours are able to overcome this, but rely on a good goal space.
Our approach explicitly seeks the middle ground, enabling the human designer to specify a vast but meaningful proto-goal space.
arXiv Detail & Related papers (2023-02-09T15:22:09Z)
- Discrete Factorial Representations as an Abstraction for Goal Conditioned Reinforcement Learning
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally demonstrate improved expected return on out-of-distribution goals, while still allowing goals to be specified with expressive structure.
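The discretizing bottleneck mentioned above can be illustrated with a minimal vector-quantization-style lookup. This is an assumption-laden sketch: the codebook, dimensions, and values below are hypothetical, and the actual method learns the codebook during training rather than fixing it by hand.

```python
# Sketch of a discretizing bottleneck: snap a continuous goal embedding
# to its nearest entry in a small codebook, so downstream policies
# condition on a discrete goal code. Codebook values are illustrative.

def quantize(goal: list, codebook: list) -> int:
    """Return the index of the nearest codebook vector (squared L2)."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(range(len(codebook)), key=lambda i: sq_dist(goal, codebook[i]))

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # toy 2-D codes
code = quantize([0.9, 0.2], codebook)            # nearest is [1.0, 0.0]
discrete_goal = codebook[code]
```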
arXiv Detail & Related papers (2022-11-01T03:31:43Z)
- Towards an Interpretable Hierarchical Agent Framework using Semantic Goals
This work introduces an interpretable hierarchical agent framework by combining planning and semantic goal directed reinforcement learning.
We evaluate our framework on a robotic block manipulation task and show that it performs better than other methods.
arXiv Detail & Related papers (2022-10-16T02:04:13Z)
- Deep Hierarchical Planning from Pixels
Director is a method for learning hierarchical behaviors directly from pixels by planning inside the latent space of a learned world model.
Despite operating in latent space, the decisions are interpretable because the world model can decode goals into images for visualization.
Director also learns successful behaviors across a wide range of environments, including visual control, Atari games, and DMLab levels.
arXiv Detail & Related papers (2022-06-08T18:20:15Z)
- Planning to Practice: Efficient Online Fine-Tuning by Composing Goals in Latent Space
General-purpose robots require diverse repertoires of behaviors to complete challenging tasks in real-world unstructured environments.
To address this issue, goal-conditioned reinforcement learning aims to acquire policies that can reach goals for a wide range of tasks on command.
We propose Planning to Practice, a method that makes it practical to train goal-conditioned policies for long-horizon tasks.
arXiv Detail & Related papers (2022-05-17T06:58:17Z)
- Hierarchical Skills for Efficient Exploration
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Batch Exploration with Examples for Scalable Robotic Reinforcement Learning
Batch Exploration with Examples (BEE) explores relevant regions of the state-space guided by a modest number of human provided images of important states.
BEE is able to tackle challenging vision-based manipulation tasks both in simulation and on a real Franka robot.
arXiv Detail & Related papers (2020-10-22T17:49:25Z)
- Weakly-Supervised Reinforcement Learning for Controllable Behavior
Reinforcement learning (RL) is a powerful framework for learning to take actions to solve tasks.
In many settings, an agent must winnow down the inconceivably large space of all possible tasks to the single task that it is currently being asked to solve.
We introduce a framework for using weak supervision to automatically disentangle this semantically meaningful subspace of tasks from the enormous space of nonsensical "chaff" tasks.
arXiv Detail & Related papers (2020-04-06T17:50:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this information and is not responsible for any consequences arising from its use.