Contextual Pre-planning on Reward Machine Abstractions for Enhanced
Transfer in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2307.05209v4
- Date: Wed, 21 Feb 2024 01:06:35 GMT
- Title: Contextual Pre-planning on Reward Machine Abstractions for Enhanced
Transfer in Deep Reinforcement Learning
- Authors: Guy Azran, Mohamad H. Danesh, Stefano V. Albrecht, Sarah Keren
- Abstract summary: Deep reinforcement learning (DRL) agents tend to overfit to the task on which they were trained and fail to adapt to minor environment changes.
We propose a novel approach to representing the current task using reward machines (RMs).
Our method provides agents with symbolic representations of optimal transitions from their current abstract state and rewards them for achieving these transitions.
- Score: 20.272179949107514
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Recent studies show that deep reinforcement learning (DRL) agents tend to
overfit to the task on which they were trained and fail to adapt to minor
environment changes. To expedite learning when transferring to unseen tasks, we
propose a novel approach to representing the current task using reward machines
(RMs), state machine abstractions that induce subtasks based on the current
task's rewards and dynamics. Our method provides agents with symbolic
representations of optimal transitions from their current abstract state and
rewards them for achieving these transitions. These representations are shared
across tasks, allowing agents to exploit knowledge of previously encountered
symbols and transitions, thus enhancing transfer. Empirical results show that
our representations improve sample efficiency and few-shot transfer in a
variety of domains.
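To make the mechanism concrete, below is a minimal sketch of a reward machine and the transition-based shaping bonus the abstract describes. The RM layout, event symbols, pre-planning via BFS on the abstract graph, and the bonus value are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch: a reward machine (RM) over symbolic events, plus a shaping
# bonus for achieving the suggested next transition from the current RM state.
# The RM layout, event names, and bonus value are illustrative assumptions.
from collections import deque

class RewardMachine:
    def __init__(self, transitions, rewards, initial_state, goal_state):
        self.delta = transitions        # (rm_state, event) -> next rm_state
        self.rewards = rewards          # (rm_state, event) -> task reward
        self.state = initial_state
        self.goal = goal_state

    def step(self, event):
        """Advance the RM on a detected symbolic event; return task reward."""
        key = (self.state, event)
        reward = self.rewards.get(key, 0.0)
        self.state = self.delta.get(key, self.state)
        return reward

    def optimal_event(self):
        """Pre-plan on the abstract graph: return the first event on a
        shortest path from the current RM state to the goal (plain BFS)."""
        frontier = deque([(self.state, None)])
        seen = {self.state}
        while frontier:
            u, first = frontier.popleft()
            for (s, e), v in self.delta.items():
                if s == u and v not in seen:
                    seen.add(v)
                    step_event = first or e
                    if v == self.goal:
                        return step_event
                    frontier.append((v, step_event))
        return None

# Example: fetch the key (event "k"), then open the door (event "d").
rm = RewardMachine(
    transitions={("u0", "k"): "u1", ("u1", "d"): "u2"},
    rewards={("u1", "d"): 1.0},
    initial_state="u0",
    goal_state="u2",
)

suggested = rm.optimal_event()               # symbolic hint given to the agent
event = "k"                                  # event the agent actually triggered
bonus = 0.1 if event == suggested else 0.0   # shaping for the suggested transition
total_reward = rm.step(event) + bonus
```

Because the RM states and event symbols are shared across tasks, the same pre-planning routine can reuse them when the agent is dropped into a new task.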
Related papers
- State Representations as Incentives for Reinforcement Learning Agents: A Sim2Real Analysis on Robotic Grasping [3.4777703321218225]
This work examines how various state representations incentivize the agent to solve a specific robotic task.
A continuum of state representations is defined, starting from hand-crafted numerical states to encoded image-based representations.
The effects of each representation on the ability of the agent to solve the task in simulation and the transferability of the learned policy to the real robot are examined.
arXiv Detail & Related papers (2023-09-21T11:41:22Z)
- Generalization in Visual Reinforcement Learning with the Reward Sequence Distribution [98.67737684075587]
Generalization in partially observed Markov decision processes (POMDPs) is critical for successful applications of visual reinforcement learning (VRL).
We propose the reward sequence distribution conditioned on the starting observation and a predefined subsequent action sequence (RSD-OA).
Experiments demonstrate that our representation learning approach based on RSD-OA significantly improves the generalization performance on unseen environments.
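As a rough illustration of the idea (not the paper's code), the sketch below trains an observation encoder to predict the reward sequence that follows a predefined action sequence, using the mean of the distribution for simplicity; all names and layer sizes are assumptions.

```python
# Sketch of an RSD-OA-style auxiliary objective: from the starting observation,
# predict the reward sequence produced by a predefined action sequence.
# Architecture and the mean-squared objective are illustrative assumptions.
import torch
import torch.nn as nn

obs_dim, horizon = 32, 5

encoder = nn.Sequential(nn.Linear(obs_dim, 64), nn.ReLU(), nn.Linear(64, 64))
reward_head = nn.Linear(64, horizon)  # one predicted reward per future step

def rsd_aux_loss(start_obs, reward_seq):
    """start_obs: (B, obs_dim); reward_seq: (B, horizon) rewards observed
    after executing the predefined action sequence from start_obs."""
    z = encoder(start_obs)
    return nn.functional.mse_loss(reward_head(z), reward_seq)

# Usage: add this loss to the RL objective so the representation encodes
# reward-predictive features rather than appearance-specific ones.
batch_obs = torch.randn(8, obs_dim)
batch_rewards = torch.randn(8, horizon)
loss = rsd_aux_loss(batch_obs, batch_rewards)
loss.backward()
```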
arXiv Detail & Related papers (2023-02-19T15:47:24Z)
- Investigating the role of model-based learning in exploration and transfer [11.652741003589027]
In this paper, we investigate transfer learning in the context of model-based agents.
We find that a model-based approach outperforms controlled model-free baselines for transfer learning.
Our results show that intrinsic exploration combined with environment models presents a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
arXiv Detail & Related papers (2023-02-08T11:49:58Z)
- Provable Benefits of Representational Transfer in Reinforcement Learning [59.712501044999875]
We study the problem of representational transfer in RL, where an agent first pretrains in a number of source tasks to discover a shared representation.
We show that given generative access to source tasks, we can discover a representation, using which subsequent linear RL techniques quickly converge to a near-optimal policy.
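A toy sketch of this pipeline follows: a stand-in pretrained feature map plays the role of the representation discovered on source tasks, and a least-squares linear value fit stands in for the subsequent linear RL step on the target task. Everything here is an illustrative assumption.

```python
# Sketch of representational transfer: freeze a representation phi learned on
# source tasks, then fit a *linear* value head on top for the target task.
# phi is a stand-in (random projection); names and data are placeholders.
import numpy as np

rng = np.random.default_rng(0)
state_dim, feat_dim = 10, 4
W = rng.normal(size=(feat_dim, state_dim))   # pretend pretrained representation

def phi(states):
    return np.tanh(states @ W.T)             # shared features from source tasks

# Target task: least-squares fit of a linear value head on the frozen features.
states = rng.normal(size=(200, state_dim))
returns = rng.normal(size=200)               # Monte Carlo returns (placeholder)
features = phi(states)
theta, *_ = np.linalg.lstsq(features, returns, rcond=None)
values = features @ theta                    # linear value estimates
```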
arXiv Detail & Related papers (2022-05-29T04:31:29Z)
- Learning Abstract and Transferable Representations for Planning [25.63560394067908]
We propose a framework for autonomously learning state abstractions of an agent's environment.
These abstractions are task-independent, and so can be reused to solve new tasks.
We show how to combine these portable representations with problem-specific ones to generate a sound description of a specific task.
arXiv Detail & Related papers (2022-05-04T14:40:04Z)
- High-level Features for Resource Economy and Fast Learning in Skill Transfer [0.8602553195689513]
Deep networks have proven effective due to their ability to form increasingly complex abstractions.
Previous work either enforced the formation of abstractions, introducing designer bias, or used a large number of neural units.
We propose to exploit neural response dynamics to form compact representations to use in skill transfer.
arXiv Detail & Related papers (2021-06-18T21:05:21Z)
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
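A minimal sketch of the prototype mechanism, assuming a learnable prototype set and soft nearest-prototype assignments; the sizes and temperature are illustrative, not Proto-RL's actual hyperparameters.

```python
# Sketch of prototypical representations: embed observations and softly assign
# them to a small set of learnable prototypes, which both summarize visited
# states and serve as a basis for downstream features.
import torch
import torch.nn.functional as F

n_protos, emb_dim = 16, 32
prototypes = torch.nn.Parameter(torch.randn(n_protos, emb_dim))

def proto_features(z, temperature=0.1):
    """z: (B, emb_dim) encoded observations -> (B, n_protos) soft assignments."""
    z = F.normalize(z, dim=-1)
    protos = F.normalize(prototypes, dim=-1)
    return F.softmax(z @ protos.T / temperature, dim=-1)

z = torch.randn(8, emb_dim)
assignments = proto_features(z)  # rare prototypes can drive exploration bonuses
```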
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
- Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and DeepMind Control suite.
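One plausible instantiation of such an auxiliary task is sketched below; the return-similarity threshold, temperature, and loss form are assumptions rather than the paper's exact objective.

```python
# Sketch of a return-based contrastive auxiliary task: push embeddings of
# state-action pairs with similar returns together, dissimilar returns apart.
import torch
import torch.nn.functional as F

def return_contrastive_loss(emb_a, emb_b, ret_a, ret_b, threshold=0.5):
    """emb_*: (B, D) embeddings of two batches of state-action pairs;
    ret_*: (B,) their observed returns."""
    same = (torch.abs(ret_a - ret_b) < threshold).float()  # 1 if similar return
    sim = F.cosine_similarity(emb_a, emb_b)                # in [-1, 1]
    logits = sim * 5.0                                     # temperature scaling
    return F.binary_cross_entropy_with_logits(logits, same)

emb_a, emb_b = torch.randn(8, 32), torch.randn(8, 32)
ret_a, ret_b = torch.randn(8), torch.randn(8)
loss = return_contrastive_loss(emb_a, emb_b, ret_a, ret_b)
```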
arXiv Detail & Related papers (2021-02-22T13:04:18Z)
- Continuous Transition: Improving Sample Efficiency for Continuous Control Problems via MixUp [119.69304125647785]
This paper introduces a concise yet powerful method to construct Continuous Transition.
Specifically, we propose to synthesize new transitions for training by linearly interpolating the consecutive transitions.
To keep the constructed transitions authentic, we also develop a discriminator to guide the construction process automatically.
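A minimal sketch of the interpolation step, assuming a Beta-distributed mixing coefficient; the discriminator that keeps synthetic transitions authentic is omitted here.

```python
# Sketch of the Continuous Transition idea: synthesize a training transition by
# linearly interpolating two *consecutive* transitions with a random mixing
# coefficient. The Beta-distributed lambda is an illustrative choice.
import numpy as np

rng = np.random.default_rng(0)

def mix_consecutive(t0, t1, alpha=0.8):
    """t0, t1: consecutive transitions as dicts with keys s, a, r, s_next."""
    lam = rng.beta(alpha, alpha)
    return {k: lam * t0[k] + (1.0 - lam) * t1[k] for k in ("s", "a", "r", "s_next")}

t0 = {"s": np.zeros(3), "a": np.ones(2), "r": 0.0, "s_next": np.ones(3)}
t1 = {"s": np.ones(3), "a": np.ones(2), "r": 1.0, "s_next": 2 * np.ones(3)}
synthetic = mix_consecutive(t0, t1)  # fed to the replay buffer with real data
# The paper additionally trains a discriminator to keep synthetic transitions
# close to the true dynamics; that component is omitted in this sketch.
```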
arXiv Detail & Related papers (2020-11-30T01:20:23Z)
- Unsupervised Transfer Learning for Spatiotemporal Predictive Networks [90.67309545798224]
We study how to transfer knowledge from a zoo of models pretrained without supervision towards another network.
Our motivation is that models are expected to understand complex dynamics from different sources.
Our approach yields significant improvements on three benchmarks for spatiotemporal prediction, and benefits the target network even from less relevant pretrained models.
arXiv Detail & Related papers (2020-09-24T15:40:55Z)