Uniform State Abstraction For Reinforcement Learning
- URL: http://arxiv.org/abs/2004.02919v1
- Date: Mon, 6 Apr 2020 18:13:08 GMT
- Title: Uniform State Abstraction For Reinforcement Learning
- Authors: John Burden and Daniel Kudenko
- Abstract summary: MultiGrid Reinforcement Learning (MRL) has shown that abstract knowledge in the form of a potential function can be learned almost solely from agent interaction with the environment.
In this paper we extend and improve MRL to take advantage of modern Deep Learning algorithms such as Deep Q-Networks (DQN).
- Score: 6.624726878647541
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Potential Based Reward Shaping combined with a potential function based on
appropriately defined abstract knowledge has been shown to significantly
improve learning speed in Reinforcement Learning. MultiGrid Reinforcement
Learning (MRL) has further shown that such abstract knowledge in the form of a
potential function can be learned almost solely from agent interaction with the
environment. However, we show that MRL does not extend well to Deep Learning. In
this paper we extend and improve MRL to take advantage of modern Deep Learning
algorithms such as Deep Q-Networks (DQN). We show that a DQN augmented with our
approach performs significantly better on continuous control tasks than both its
vanilla counterpart and a DQN augmented with MRL.
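The abstract does not spell out the shaping mechanism, but potential-based reward shaping conventionally adds F(s, s') = gamma * Phi(s') - Phi(s) to the environment reward, with Phi here standing in for the value of the abstract (coarse) state that MRL-style methods learn from interaction. The Python sketch below shows, under that assumption, how such a shaped reward could feed a DQN-style TD target; the function names and the abstract-state values are hypothetical illustrations, not the authors' code.

```python
# Minimal sketch (assumption, not the paper's implementation): standard
# potential-based reward shaping, F(s, s') = gamma * Phi(s') - Phi(s),
# applied to a DQN-style TD target. Phi(s) is taken to be the learned
# value of the abstract state containing s and is passed in as a number.

GAMMA = 0.99

def shaped_reward(r, phi_s, phi_next_s, gamma=GAMMA):
    """Environment reward plus the shaping term gamma * Phi(s') - Phi(s)."""
    return r + gamma * phi_next_s - phi_s

def td_target(r, phi_s, phi_next_s, q_next_max, done, gamma=GAMMA):
    """DQN target computed on the shaped reward (hypothetical wiring)."""
    r_shaped = shaped_reward(r, phi_s, phi_next_s, gamma)
    return r_shaped + (0.0 if done else gamma * q_next_max)

# Example: a sparse-reward transition (r = 0) into a more valuable
# abstract state still yields a positive learning signal.
print(td_target(r=0.0, phi_s=0.2, phi_next_s=0.5, q_next_max=1.0, done=False))
```

Because the shaping term telescopes along any trajectory, potential-based shaping of this form leaves the optimal policy unchanged while densifying the reward signal, which is what makes abstraction-derived potentials attractive for sparse-reward continuous control tasks.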
Related papers
- Continuous Control with Coarse-to-fine Reinforcement Learning [15.585706638252441]
We present a framework that trains RL agents to zoom into a continuous action space in a coarse-to-fine manner.
Within this framework, we introduce a concrete, value-based algorithm called Coarse-to-fine Q-Network (CQN).
CQN robustly learns to solve real-world manipulation tasks within a few minutes of online training.
arXiv Detail & Related papers (2024-07-10T16:04:08Z)
- Mixture of Experts in a Mixture of RL settings [15.124698782503248]
We show that MoEs can boost Deep Reinforcement Learning (DRL) performance by expanding the network's parameter count while reducing dormant neurons.
We shed more light on MoEs' ability to deal with non-stationarity and investigate MoEs in DRL settings with "amplified" non-stationarity via multi-task training.
arXiv Detail & Related papers (2024-06-26T15:15:15Z) - Lifelong Reinforcement Learning with Modulating Masks [16.24639836636365]
Lifelong learning aims to create AI systems that continuously and incrementally learn during a lifetime, similar to biological learning.
Attempts so far have met problems, including catastrophic forgetting, interference among tasks, and the inability to exploit previous knowledge.
We show that lifelong reinforcement learning with modulating masks is a promising approach to lifelong learning, to the composition of knowledge to learn increasingly complex tasks, and to knowledge reuse for efficient and faster learning.
arXiv Detail & Related papers (2022-12-21T15:49:20Z)
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- Multi-Agent Broad Reinforcement Learning for Intelligent Traffic Light Control [21.87935026688773]
Existing approaches to Multi-Agent Systems (MAS) are largely based on Multi-Agent Deep Reinforcement Learning (MADRL).
We propose a Multi-Agent Broad Reinforcement Learning (MABRL) framework to explore the function of the Broad Learning System (BLS) in MAS.
arXiv Detail & Related papers (2022-03-08T14:04:09Z)
- Mask-based Latent Reconstruction for Reinforcement Learning [58.43247393611453]
Mask-based Latent Reconstruction (MLR) is proposed to predict complete state representations in the latent space from observations with spatially and temporally masked pixels.
Extensive experiments show that our MLR significantly improves the sample efficiency in deep reinforcement learning.
arXiv Detail & Related papers (2022-01-28T13:07:11Z)
- Rethinking Learning Dynamics in RL using Adversarial Networks [79.56118674435844]
We present a learning mechanism for reinforcement learning of closely related skills parameterized via a skill embedding space.
The main contribution of our work is to formulate an adversarial training regime for reinforcement learning with the help of an entropy-regularized policy gradient formulation.
arXiv Detail & Related papers (2022-01-27T19:51:09Z)
- Multitask Adaptation by Retrospective Exploration with Learned World Models [77.34726150561087]
We propose a meta-learned addressing model called RAMa that provides the model-based RL (MBRL) agent with training samples drawn from task-agnostic storage.
The model is trained to maximize the expected agent's performance by selecting promising trajectories solving prior tasks from the storage.
arXiv Detail & Related papers (2021-10-25T20:02:57Z)
- On The Transferability of Deep-Q Networks [6.822707222147354]
Transfer Learning is an efficient machine learning paradigm that allows overcoming some of the hurdles that characterize the successful training of deep neural networks.
While exploiting TL is a well-established and successful training practice in Supervised Learning (SL), its application in Deep Reinforcement Learning (DRL) remains far less common.
In this paper, we study the level of transferability of three different variants of Deep-Q Networks on popular DRL benchmarks and on a set of novel, carefully designed control tasks.
arXiv Detail & Related papers (2021-10-06T10:29:37Z)
- Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
- Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and DeepMind Control suite.
arXiv Detail & Related papers (2021-02-22T13:04:18Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.