Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
- URL: http://arxiv.org/abs/2003.01629v2
- Date: Sat, 27 Jun 2020 03:29:14 GMT
- Title: Can Increasing Input Dimensionality Improve Deep Reinforcement Learning?
- Authors: Kei Ota, Tomoaki Oiki, Devesh K. Jha, Toshisada Mariyama, Daniel
Nikovski
- Abstract summary: We propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms.
We show that the RL agents learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations.
- Score: 15.578423102700764
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep reinforcement learning (RL) algorithms have recently achieved remarkable
successes in various sequential decision making tasks, leveraging advances in
methods for training large deep networks. However, these methods usually
require large amounts of training data, which is often a big problem for
real-world applications. One natural question to ask is whether learning good
representations for states and using larger networks helps in learning better
policies. In this paper, we try to study if increasing input dimensionality
helps improve performance and sample efficiency of model-free deep RL
algorithms. To do so, we propose an online feature extractor network (OFENet)
that uses neural nets to produce good representations to be used as inputs to
deep RL algorithms. Even though the high dimensionality of input is usually
supposed to make learning of RL agents more difficult, we show that the RL
agents in fact learn more efficiently with the high-dimensional representation
than with the lower-dimensional state observations. We believe that stronger
feature propagation together with larger networks (and thus larger search
space) allows RL agents to learn more complex functions of states and thus
improves the sample efficiency. Through numerical experiments, we show that the
proposed method outperforms several other state-of-the-art algorithms in terms
of both sample efficiency and performance. Codes for the proposed method are
available at http://www.merl.com/research/license/OFENet .
Related papers
- SHIRE: Enhancing Sample Efficiency using Human Intuition in REinforcement Learning [11.304750795377657]
We propose SHIRE, a framework for encoding human intuition using Probabilistic Graphical Models (PGMs)
SHIRE achieves 25-78% sample efficiency gains across the environments we evaluate at negligible overhead cost.
arXiv Detail & Related papers (2024-09-16T04:46:22Z) - M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL)
Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms.
We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z) - Enhancing data efficiency in reinforcement learning: a novel imagination
mechanism based on mesh information propagation [0.3729614006275886]
We introduce a novel mesh information propagation mechanism, termed the 'Imagination Mechanism (IM)'
IM enables information generated by a single sample to be effectively broadcasted to different states across episodes.
To promote versatility, we extend the IM to function as a plug-and-play module that can be seamlessly and fluidly integrated into other widely adopted RL algorithms.
arXiv Detail & Related papers (2023-09-25T16:03:08Z) - Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL)
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z) - Contrastive Learning as Goal-Conditioned Reinforcement Learning [147.28638631734486]
In reinforcement learning (RL), it is easier to solve a task if given a good representation.
While deep RL should automatically acquire such good representations, prior work often finds that learning representations in an end-to-end fashion is unstable.
We show (contrastive) representation learning methods can be cast as RL algorithms in their own right.
arXiv Detail & Related papers (2022-06-15T14:34:15Z) - Scalable Deep Reinforcement Learning Algorithms for Mean Field Games [60.550128966505625]
Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents.
Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods.
Existing algorithms to solve MFGs require the mixing of approximated quantities such as strategies or $q$-values.
We propose two methods to address this shortcoming. The first one learns a mixed strategy from distillation of historical data into a neural network and is applied to the Fictitious Play algorithm.
The second one is an online mixing method based on
arXiv Detail & Related papers (2022-03-22T18:10:32Z) - Maximum Entropy Model-based Reinforcement Learning [0.0]
This work connects exploration techniques and model-based reinforcement learning.
We have designed a novel exploration method that takes into account features of the model-based approach.
We also demonstrate through experiments that our method significantly improves the performance of the model-based algorithm Dreamer.
arXiv Detail & Related papers (2021-12-02T13:07:29Z) - Improving Computational Efficiency in Visual Reinforcement Learning via
Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER)
SEER is a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RLizable agents while significantly saving computation and memory.
arXiv Detail & Related papers (2021-03-04T08:14:10Z) - Training Larger Networks for Deep Reinforcement Learning [18.193180866998333]
We show that naively increasing network capacity does not improve performance.
We propose a novel method that consists of 1) wider networks with DenseNet connection, 2) decoupling representation learning from training of RL, and 3) a distributed training method to mitigate overfitting problems.
Using this three-fold technique, we show that we can train very large networks that result in significant performance gains.
arXiv Detail & Related papers (2021-02-16T02:16:54Z) - Learning Dexterous Manipulation from Suboptimal Experts [69.8017067648129]
Relative Entropy Q-Learning (REQ) is a simple policy algorithm that combines ideas from successful offline and conventional RL algorithms.
We show how REQ is also effective for general off-policy RL, offline RL, and RL from demonstrations.
arXiv Detail & Related papers (2020-10-16T18:48:49Z) - Reinforcement Learning with Augmented Data [97.42819506719191]
We present Reinforcement Learning with Augmented Data (RAD), a simple plug-and-play module that can enhance most RL algorithms.
We show that augmentations such as random translate, crop, color jitter, patch cutout, random convolutions, and amplitude scale can enable simple RL algorithms to outperform complex state-of-the-art methods.
arXiv Detail & Related papers (2020-04-30T17:35:32Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.