Successor Feature Neural Episodic Control
- URL: http://arxiv.org/abs/2111.03110v2
- Date: Wed, 2 Aug 2023 20:21:49 GMT
- Title: Successor Feature Neural Episodic Control
- Authors: David Emukpere, Xavier Alameda-Pineda and Chris Reinke
- Abstract summary: A longstanding goal in reinforcement learning is to build intelligent agents that show fast learning and a flexible transfer of skills akin to humans and animals.
This paper investigates the integration of two frameworks for tackling those goals: episodic control and successor features.
- Score: 17.706998080391635
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: A longstanding goal in reinforcement learning is to build intelligent agents
that show fast learning and a flexible transfer of skills akin to humans and
animals. This paper investigates the integration of two frameworks for tackling
those goals: episodic control and successor features. Episodic control is a
cognitively inspired approach relying on episodic memory, an instance-based
memory model of an agent's experiences. Meanwhile, successor features and
generalized policy improvement (SF&GPI) is a meta and transfer learning
framework that learns policies for tasks so that they can be efficiently reused
for later tasks which have a different reward function. Individually, these two
techniques have shown impressive results in vastly improving sample efficiency
and the elegant reuse of previously learned policies. Thus, we outline a
combination of both approaches in a single reinforcement learning framework and
empirically illustrate its benefits.
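
To make the two ingredients concrete, here is a minimal Python sketch of how they could fit together. It is an illustration under stated assumptions, not the paper's algorithm: the abstract does not give the architecture, so the memory layout, the inverse-distance kernel, and the names `knn_successor_features` and `gpi_action` are all hypothetical. The only standard parts are the SF identity Q(s, a) = psi(s, a) . w and the GPI rule of taking the action with the highest value across all stored policies.

```python
import math


def dot(u, v):
    """Inner product of two equal-length sequences."""
    return sum(a * b for a, b in zip(u, v))


def knn_successor_features(memory, query, k=2, delta=1e-3):
    """Episodic lookup (assumed design): kernel-weighted average of the
    successor features stored under the k keys nearest to `query`.

    `memory` is a list of (key, psi) pairs; keys and psi are sequences of floats.
    """
    nearest = sorted(memory, key=lambda item: math.dist(item[0], query))[:k]
    # Inverse squared-distance kernel; delta avoids division by zero.
    weights = [1.0 / (math.dist(key, query) ** 2 + delta) for key, _ in nearest]
    total = sum(weights)
    dim = len(nearest[0][1])
    return [
        sum(w * psi[i] for w, (_, psi) in zip(weights, nearest)) / total
        for i in range(dim)
    ]


def gpi_action(memories, query, w, k=2):
    """Generalized policy improvement over stored policies.

    `memories[i][a]` is the episodic memory of policy i for action a.
    Q_i(s, a) is estimated as psi_i(s, a) . w, where w is the reward
    weight vector of the current task. Returns (action, value).
    """
    best_action, best_value = None, -math.inf
    for policy_memories in memories:
        for action, memory in enumerate(policy_memories):
            value = dot(knn_successor_features(memory, query, k), w)
            if value > best_value:
                best_action, best_value = action, value
    return best_action, best_value


# Toy data (made up): two stored policies, two actions, 2-D features.
state = (0.0, 0.0)
memories = [
    [[(state, [1.0, 0.0])], [(state, [0.0, 0.5])]],  # policy 0
    [[(state, [0.2, 0.2])], [(state, [0.0, 1.0])]],  # policy 1
]
new_task_w = [0.0, 1.0]  # a new task that rewards only the second feature
action, value = gpi_action(memories, state, new_task_w)
print(action, value)
```

Changing only `new_task_w` re-evaluates every stored policy on the new reward without relearning the successor features, which is the reuse property the abstract attributes to SF&GPI. The nearest-neighbor lookup stands in for the instance-based memory of episodic control.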
Related papers
- Multi-Agent Transfer Learning via Temporal Contrastive Learning [8.487274986507922]
This paper introduces a novel transfer learning framework for deep multi-agent reinforcement learning.
The approach automatically combines goal-conditioned policies with temporal contrastive learning to discover meaningful sub-goals.
arXiv Detail & Related papers (2024-06-03T14:42:14Z)
- Fantastic Rewards and How to Tame Them: A Case Study on Reward Learning for Task-oriented Dialogue Systems [111.80916118530398]
Reinforcement learning (RL) techniques can naturally be utilized to train dialogue strategies to achieve user-specific goals.
This paper aims at answering the question of how to efficiently learn and leverage a reward function for training end-to-end (E2E) ToD agents.
arXiv Detail & Related papers (2023-02-20T22:10:04Z)
- Explaining Agent's Decision-making in a Hierarchical Reinforcement Learning Scenario [0.6643086804649938]
Reinforcement learning is a machine learning approach based on behavioral psychology.
In this work, we make use of the memory-based explainable reinforcement learning method in a hierarchical environment composed of sub-tasks.
arXiv Detail & Related papers (2022-12-14T01:18:45Z)
- Basis for Intentions: Efficient Inverse Reinforcement Learning using Past Experience [89.30876995059168]
This paper addresses the problem of IRL -- inferring the reward function of an agent from observing its behavior.
arXiv Detail & Related papers (2022-08-09T17:29:49Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm, called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called $\Psi \Phi$-learning.
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Behavior Priors for Efficient Reinforcement Learning [97.81587970962232]
We consider how information and architectural constraints can be combined with ideas from the probabilistic modeling literature to learn behavior priors.
We discuss how such latent variable formulations connect to related work on hierarchical reinforcement learning (HRL) and mutual information and curiosity based objectives.
We demonstrate the effectiveness of our framework by applying it to a range of simulated continuous control domains.
arXiv Detail & Related papers (2020-10-27T13:17:18Z)
- Rethinking Supervised Learning and Reinforcement Learning in Task-Oriented Dialogue Systems [58.724629408229205]
We demonstrate how traditional supervised learning and a simulator-free adversarial learning method can be used to achieve performance comparable to state-of-the-art RL-based methods.
Our main goal is not to beat reinforcement learning with supervised learning, but to demonstrate the value of rethinking the role of reinforcement learning and supervised learning in optimizing task-oriented dialogue systems.
arXiv Detail & Related papers (2020-09-21T12:04:18Z)
- Importance Weighted Policy Learning and Adaptation [89.46467771037054]
We study a complementary approach which is conceptually simple, general, modular and built on top of recent improvements in off-policy learning.
The framework is inspired by ideas from the probabilistic inference literature and combines robust off-policy learning with a behavior prior.
Our approach achieves competitive adaptation performance on hold-out tasks compared to meta reinforcement learning baselines and can scale to complex sparse-reward scenarios.
arXiv Detail & Related papers (2020-09-10T14:16:58Z)