Possibility Before Utility: Learning And Using Hierarchical Affordances
- URL: http://arxiv.org/abs/2203.12686v1
- Date: Wed, 23 Mar 2022 19:17:22 GMT
- Title: Possibility Before Utility: Learning And Using Hierarchical Affordances
- Authors: Robby Costales and Shariq Iqbal and Fei Sha
- Abstract summary: Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures.
We present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning.
- Score: 21.556661319375255
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning algorithms struggle on tasks with complex hierarchical dependency structures. Humans and other intelligent agents do not waste time assessing the utility of every high-level action in existence, but instead only consider ones they deem possible in the first place. By focusing only on what is feasible, or "afforded", at the present moment, an agent can spend more time both evaluating the utility of and acting on what matters. To this end, we present Hierarchical Affordance Learning (HAL), a method that learns a model of hierarchical affordances in order to prune impossible subtasks for more effective learning. Existing works in hierarchical reinforcement learning provide agents with structural representations of subtasks but are not affordance-aware. By grounding our definition of hierarchical affordances in the present state, our approach is more flexible than the multitude of approaches that ground their subtask dependencies in a symbolic history. While these logic-based methods often require complete knowledge of the subtask hierarchy, our approach is able to utilize incomplete and varying symbolic specifications. Furthermore, we demonstrate that relative to non-affordance-aware methods, HAL agents are better able to efficiently learn complex tasks, navigate environment stochasticity, and acquire diverse skills in the absence of extrinsic supervision -- all of which are hallmarks of human learning.
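To make the core mechanism concrete, the sketch below illustrates affordance-gated subtask selection in the spirit of HAL: a learned affordance model first prunes subtasks predicted to be impossible from the current state, and utility is compared only over what remains. This is a minimal illustration under assumed names, not the authors' implementation; the `affordance_prob` and `q_value` callables, the threshold, and the fallback behaviour are all hypothetical.

```python
# Minimal sketch (assumed names, not the authors' code): affordance-gated
# subtask selection. An affordance model estimates which subtasks are
# currently possible; utility (Q-values) is only compared over that set.
import numpy as np

def select_subtask(state, subtasks, affordance_prob, q_value, threshold=0.5):
    """Pick the highest-utility subtask among those deemed afforded in `state`.

    affordance_prob(state, subtask) -> probability the subtask is achievable now
    q_value(state, subtask)         -> estimated utility of pursuing the subtask
    """
    afforded = np.array([affordance_prob(state, s) for s in subtasks])
    mask = afforded >= threshold
    if not mask.any():
        # If nothing is predicted possible, fall back to considering everything.
        mask = np.ones(len(subtasks), dtype=bool)
    utilities = np.array([q_value(state, s) for s in subtasks], dtype=float)
    utilities[~mask] = -np.inf  # prune subtasks predicted to be impossible
    return subtasks[int(np.argmax(utilities))]

# Toy usage with stand-in models: higher-utility subtasks are ignored until
# they become afforded, so the agent works on what is possible right now.
subtasks = ["get_wood", "make_plank", "make_stick"]
pick = select_subtask(
    state={"wood": 0},
    subtasks=subtasks,
    affordance_prob=lambda s, t: 1.0 if t == "get_wood" or s["wood"] > 0 else 0.0,
    q_value=lambda s, t: {"get_wood": 0.2, "make_plank": 0.9, "make_stick": 0.8}[t],
)
# pick == "get_wood": possibility is checked before utility.
```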
Related papers
- DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning [36.50275602760051]
We introduce DIPPER: Direct Preference Optimization to Accelerate Primitive-Enabled Hierarchical Reinforcement Learning.
It is an efficient hierarchical approach that leverages direct preference optimization to learn a higher-level policy and reinforcement learning to learn a lower-level policy.
It enjoys improved computational efficiency due to its use of direct preference optimization instead of standard preference-based approaches.
arXiv Detail & Related papers (2024-06-16T10:49:41Z)
- Creating Multi-Level Skill Hierarchies in Reinforcement Learning [0.0]
We propose an answer based on a graphical representation of how the interaction between an agent and its environment may unfold.
Our approach uses modularity maximisation as a central organising principle to expose the structure of the interaction graph at multiple levels of abstraction.
arXiv Detail & Related papers (2023-06-16T17:23:49Z)
- Autonomous Open-Ended Learning of Tasks with Non-Stationary Interdependencies [64.0476282000118]
Intrinsic motivations have been shown to generate a task-agnostic signal for properly allocating training time amongst goals.
While the majority of works in the field of intrinsically motivated open-ended learning focus on scenarios where goals are independent of each other, only a few have studied the autonomous acquisition of interdependent tasks.
In particular, we first deepen the analysis of a previous system, showing the importance of incorporating information about the relationships between tasks at a higher level of the architecture.
Then we introduce H-GRAIL, a new system that extends the previous one by adding a new learning layer to store the autonomously acquired sequences.
arXiv Detail & Related papers (2022-05-16T10:43:01Z)
- Learning from Guided Play: A Scheduled Hierarchical Approach for Improving Exploration in Adversarial Imitation Learning [7.51557557629519]
We present Learning from Guided Play (LfGP), a framework in which we leverage expert demonstrations of multiple auxiliary tasks in addition to a main task.
This affords many benefits: learning efficiency is improved for main tasks with challenging bottleneck transitions, expert data becomes reusable between tasks, and transfer learning through the reuse of learned auxiliary task models becomes possible.
arXiv Detail & Related papers (2021-12-16T14:58:08Z)
- Value Function Spaces: Skill-Centric State Abstractions for Long-Horizon Reasoning [120.38381203153159]
Reinforcement learning can train policies that effectively perform complex tasks.
For long-horizon tasks, the performance of these methods degrades with horizon, often necessitating reasoning over and composing lower-level skills.
We propose Value Function Spaces: a simple approach that produces such a skill-centric state representation by using the value functions corresponding to each lower-level skill (a minimal sketch of this idea appears after this list).
arXiv Detail & Related papers (2021-11-04T22:46:16Z)
- Hierarchical Skills for Efficient Exploration [70.62309286348057]
In reinforcement learning, pre-trained low-level skills have the potential to greatly facilitate exploration.
Prior knowledge of the downstream task is required to strike the right balance between generality (fine-grained control) and specificity (faster learning) in skill design.
We propose a hierarchical skill learning framework that acquires skills of varying complexity in an unsupervised manner.
arXiv Detail & Related papers (2021-10-20T22:29:32Z)
- Example-Driven Model-Based Reinforcement Learning for Solving Long-Horizon Visuomotor Tasks [85.56153200251713]
We introduce EMBR, a model-based RL method for learning primitive skills that are suitable for completing long-horizon visuomotor tasks.
On a Franka Emika robot arm, we find that EMBR enables the robot to complete three long-horizon visuomotor tasks at an 85% success rate.
arXiv Detail & Related papers (2021-09-21T16:48:07Z)
- Learning Task Decomposition with Ordered Memory Policy Network [73.3813423684999]
We propose Ordered Memory Policy Network (OMPN) to discover subtask hierarchy by learning from demonstration.
OMPN can be applied to partially observable environments and still achieve higher task decomposition performance.
Our visualization confirms that the subtask hierarchy can emerge in our model.
arXiv Detail & Related papers (2021-03-19T18:13:35Z)
- Hierarchical Reinforcement Learning as a Model of Human Task Interleaving [60.95424607008241]
We develop a hierarchical model of supervisory control driven by reinforcement learning.
The model reproduces known empirical effects of task interleaving.
The results support hierarchical RL as a plausible model of task interleaving.
arXiv Detail & Related papers (2020-01-04T17:53:28Z)
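As a minimal illustration of the Value Function Spaces idea referenced in the list above, the hypothetical sketch below represents the current state by the vector of each lower-level skill's value estimate. The skill value functions and state fields here are toy stand-ins, not the paper's models.

```python
# Hypothetical sketch of a Value Function Spaces-style abstraction (assumed names):
# the state is embedded as the vector of value estimates, one per low-level skill.
import numpy as np

def value_function_space(state, skill_value_fns):
    """Skill-centric embedding: one entry per skill's value function V_i(state)."""
    return np.array([v(state) for v in skill_value_fns])

# Toy stand-in skills: each value function scores how close the state is to that
# skill's goal, so the embedding reflects which skills are currently completable.
skills = [
    lambda s: 1.0 - abs(s["gripper"] - s["object"]),  # "reach" skill
    lambda s: float(s["holding"]),                     # "grasp" skill
]
z = value_function_space({"gripper": 0.4, "object": 0.5, "holding": 0}, skills)
# z is approximately [0.9, 0.0]; a high-level policy would reason over z
# instead of the raw state when composing skills for long-horizon tasks.
```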
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.