Learning Rational Subgoals from Demonstrations and Instructions
- URL: http://arxiv.org/abs/2303.05487v1
- Date: Thu, 9 Mar 2023 18:39:22 GMT
- Title: Learning Rational Subgoals from Demonstrations and Instructions
- Authors: Zhezheng Luo, Jiayuan Mao, Jiajun Wu, Tom\'as Lozano-P\'erez, Joshua
B. Tenenbaum, Leslie Pack Kaelbling
- Abstract summary: We present a framework for learning useful subgoals that support efficient long-term planning to achieve novel goals.
At the core of our framework is a collection of rational subgoals (RSGs), which are essentially binary classifiers over the environmental states.
Given a goal description, the learned subgoals and the derived dependencies facilitate off-the-shelf planning algorithms, such as A* and RRT.
- Score: 71.86713748450363
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present a framework for learning useful subgoals that support efficient
long-term planning to achieve novel goals. At the core of our framework is a
collection of rational subgoals (RSGs), which are essentially binary
classifiers over the environmental states. RSGs can be learned from
weakly-annotated data, in the form of unsegmented demonstration trajectories,
paired with abstract task descriptions, which are composed of terms initially
unknown to the agent (e.g., collect-wood then craft-boat then go-across-river).
Our framework also discovers dependencies between RSGs, e.g., the task
collect-wood is a helpful subgoal for the task craft-boat. Given a goal
description, the learned subgoals and the derived dependencies facilitate
off-the-shelf planning algorithms, such as A* and RRT, by setting helpful
subgoals as waypoints to the planner, which significantly improves
performance-time efficiency.
Related papers
- Goal Space Abstraction in Hierarchical Reinforcement Learning via
Set-Based Reachability Analysis [0.5409704301731713]
We introduce a Feudal HRL algorithm that concurrently learns both the goal representation and a hierarchical policy.
We evaluate our approach on complex navigation tasks, showing the learned representation is interpretable, transferrable and results in data efficient learning.
arXiv Detail & Related papers (2023-09-14T12:39:26Z) - Goal Space Abstraction in Hierarchical Reinforcement Learning via
Reachability Analysis [0.0]
We propose a developmental mechanism for subgoal discovery via an emergent representation that abstracts (i.e., groups together) sets of environment states.
We create a HRL algorithm that gradually learns this representation along with the policies and evaluate it on navigation tasks to show the learned representation is interpretable and results in data efficiency.
arXiv Detail & Related papers (2023-09-12T06:53:11Z) - Balancing Exploration and Exploitation in Hierarchical Reinforcement
Learning via Latent Landmark Graphs [31.147969569517286]
Goal-Conditioned Hierarchical Reinforcement Learning (GCHRL) is a promising paradigm to address the exploration-exploitation dilemma in reinforcement learning.
The effectiveness of GCHRL heavily relies on subgoal representation functions and subgoal selection strategy.
This paper proposes HIerarchical reinforcement learning via dynamically building Latent Landmark graphs.
arXiv Detail & Related papers (2023-07-22T12:10:23Z) - Goal-Conditioned Reinforcement Learning with Disentanglement-based
Reachability Planning [14.370384505230597]
We propose a goal-conditioned RL algorithm combined with Disentanglement-based Reachability Planning (REPlan) to solve temporally extended tasks.
Our REPlan significantly outperforms the prior state-of-the-art methods in solving temporally extended tasks.
arXiv Detail & Related papers (2023-07-20T13:08:14Z) - Semantically Aligned Task Decomposition in Multi-Agent Reinforcement
Learning [56.26889258704261]
We propose a novel "disentangled" decision-making method, Semantically Aligned task decomposition in MARL (SAMA)
SAMA prompts pretrained language models with chain-of-thought that can suggest potential goals, provide suitable goal decomposition and subgoal allocation as well as self-reflection-based replanning.
SAMA demonstrates considerable advantages in sample efficiency compared to state-of-the-art ASG methods.
arXiv Detail & Related papers (2023-05-18T10:37:54Z) - Discrete Factorial Representations as an Abstraction for Goal
Conditioned Reinforcement Learning [99.38163119531745]
We show that applying a discretizing bottleneck can improve performance in goal-conditioned RL setups.
We experimentally prove the expected return on out-of-distribution goals, while still allowing for specifying goals with expressive structure.
arXiv Detail & Related papers (2022-11-01T03:31:43Z) - Let Invariant Rationale Discovery Inspire Graph Contrastive Learning [98.10268114789775]
We argue that a high-performing augmentation should preserve the salient semantics of anchor graphs regarding instance-discrimination.
We propose a new framework, Rationale-aware Graph Contrastive Learning (RGCL)
RGCL uses a rationale generator to reveal salient features about graph instance-discrimination as the rationale, and then creates rationale-aware views for contrastive learning.
arXiv Detail & Related papers (2022-06-16T01:28:40Z) - Fast Inference and Transfer of Compositional Task Structures for
Few-shot Task Generalization [101.72755769194677]
We formulate it as a few-shot reinforcement learning problem where a task is characterized by a subtask graph.
Our multi-task subtask graph inferencer (MTSGI) first infers the common high-level task structure in terms of the subtask graph from the training tasks.
Our experiment results on 2D grid-world and complex web navigation domains show that the proposed method can learn and leverage the common underlying structure of the tasks for faster adaptation to the unseen tasks.
arXiv Detail & Related papers (2022-05-25T10:44:25Z) - Meta Reinforcement Learning with Autonomous Inference of Subtask
Dependencies [57.27944046925876]
We propose and address a novel few-shot RL problem, where a task is characterized by a subtask graph.
Instead of directly learning a meta-policy, we develop a Meta-learner with Subtask Graph Inference.
Our experiment results on two grid-world domains and StarCraft II environments show that the proposed method is able to accurately infer the latent task parameter.
arXiv Detail & Related papers (2020-01-01T17:34:00Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.