Mask-based Latent Reconstruction for Reinforcement Learning
- URL: http://arxiv.org/abs/2201.12096v1
- Date: Fri, 28 Jan 2022 13:07:11 GMT
- Title: Mask-based Latent Reconstruction for Reinforcement Learning
- Authors: Tao Yu, Zhizheng Zhang, Cuiling Lan, Zhibo Chen, Yan Lu
- Abstract summary: Mask-based Latent Reconstruction (MLR) is proposed to predict the complete state representations in the latent space from the observations with spatially and temporally masked pixels.
Extensive experiments show that our MLR significantly improves the sample efficiency in deep reinforcement learning.
- Score: 58.43247393611453
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: For deep reinforcement learning (RL) from pixels, learning effective state
representations is crucial for achieving high performance. However, in
practice, limited experience and high-dimensional input prevent effective
representation learning. To address this, motivated by the success of masked
modeling in other research fields, we introduce mask-based reconstruction to
promote state representation learning in RL. Specifically, we propose a simple
yet effective self-supervised method, Mask-based Latent Reconstruction (MLR),
to predict the complete state representations in the latent space from the
observations with spatially and temporally masked pixels. MLR enables the
better use of context information when learning state representations to make
them more informative, which facilitates RL agent training. Extensive
experiments show that our MLR significantly improves the sample efficiency in
RL and outperforms the state-of-the-art sample-efficient RL methods on multiple
continuous benchmark environments.
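The following is a minimal, illustrative PyTorch sketch of the idea described in the abstract: randomly mask an observation sequence in space and time, encode the masked frames, and regress the resulting latents onto the latents of the unmasked frames produced by a separate target encoder. The network sizes, the cube-style masking function, and the cosine-similarity loss are assumptions made for this sketch, not the authors' exact architecture or objective.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    """Small CNN mapping 84x84 RGB frames to a latent vector (sizes are illustrative)."""
    def __init__(self, latent_dim=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, 3, stride=1), nn.ReLU(),
            nn.Flatten(), nn.Linear(64 * 7 * 7, latent_dim),  # assumes 84x84 inputs
        )

    def forward(self, x):                                      # x: (B, 3, 84, 84)
        return self.net(x)

def mask_spacetime(frames, patch=12, p_space=0.5, p_time=0.25):
    """Zero out random spatial patches and randomly drop whole frames."""
    B, T, C, H, W = frames.shape
    masked = frames.clone()
    masked[torch.rand(B, T) < p_time] = 0.0                    # temporal masking
    for gy in range(0, H, patch):                              # spatial patch masking
        for gx in range(0, W, patch):
            drop = (torch.rand(B, T) < p_space).float()[..., None, None, None]
            masked[:, :, :, gy:gy + patch, gx:gx + patch] *= 1.0 - drop
    return masked

def mlr_loss(online_enc, target_enc, predictor, frames):
    """Predict the latents of the complete frames from the masked frames."""
    masked = mask_spacetime(frames)
    z_pred = predictor(online_enc(masked.flatten(0, 1)))       # (B*T, D)
    with torch.no_grad():                                      # target branch (a momentum copy in practice)
        z_tgt = target_enc(frames.flatten(0, 1))
    return 1.0 - F.cosine_similarity(z_pred, z_tgt, dim=-1).mean()

online, target, head = Encoder(), Encoder(), nn.Linear(128, 128)
target.load_state_dict(online.state_dict())
obs = torch.rand(8, 4, 3, 84, 84)                              # 8 sequences of 4 frames
loss = mlr_loss(online, target, head, obs)
loss.backward()
```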
Related papers
- DEAR: Disentangled Environment and Agent Representations for Reinforcement Learning without Reconstruction [4.813546138483559]
Reinforcement Learning (RL) algorithms can learn robotic control tasks from visual observations, but they often require a large amount of data.
In this paper, we explore how the agent's knowledge of its shape can improve the sample efficiency of visual RL methods.
We propose a novel method, Disentangled Environment and Agent Representations (DEAR), that uses the segmentation mask of the agent as supervision.
arXiv Detail & Related papers (2024-06-30T09:15:21Z)
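As a rough, hypothetical illustration of using the agent's segmentation mask as supervision, the sketch below splits the latent into an agent part and an environment part and trains only the agent part to predict the mask. The split, the mask decoder, and all sizes are assumptions of this sketch, not the DEAR paper's architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DisentangledEncoder(nn.Module):
    """Illustrative encoder whose latent is split into agent and environment halves."""
    def __init__(self, latent_dim=64):
        super().__init__()
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 4, stride=2), nn.ReLU(),
            nn.Flatten(), nn.Linear(64 * 9 * 9, 2 * latent_dim),  # assumes 84x84 inputs
        )
        self.latent_dim = latent_dim
        # tiny decoder that upsamples the agent latent back to a mask logit map
        self.mask_head = nn.Sequential(
            nn.Linear(latent_dim, 21 * 21), nn.Unflatten(1, (1, 21, 21)),
            nn.Upsample(size=(84, 84), mode="bilinear", align_corners=False),
        )

    def forward(self, obs):                              # obs: (B, 3, 84, 84)
        z = self.backbone(obs)
        z_agent, z_env = z.split(self.latent_dim, dim=1)
        return z_agent, z_env

def mask_supervision_loss(model, obs, agent_mask):
    """BCE between the mask predicted from the agent latent only and the ground-truth mask."""
    z_agent, _ = model(obs)
    logits = model.mask_head(z_agent)                    # (B, 1, 84, 84)
    return F.binary_cross_entropy_with_logits(logits, agent_mask)

model = DisentangledEncoder()
obs = torch.rand(4, 3, 84, 84)
gt_mask = (torch.rand(4, 1, 84, 84) > 0.5).float()      # placeholder agent mask
loss = mask_supervision_loss(model, obs, gt_mask)
loss.backward()
```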
- iQRL -- Implicitly Quantized Representations for Sample-efficient Reinforcement Learning [24.684363928059113]
We propose an efficient representation learning method using only a self-supervised latent-state consistency loss.
We achieve high performance and prevent representation collapse by quantizing the latent representation.
Our method, iQRL (implicitly Quantized Reinforcement Learning), is straightforward and compatible with any model-free RL algorithm.
arXiv Detail & Related papers (2024-06-04T18:15:44Z)
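A hedged PyTorch sketch of the two ingredients summarized above: a quantized latent (here, simple per-dimension rounding with a straight-through gradient, standing in for the paper's quantization scheme) and a latent-state consistency loss between a predicted next latent and the encoded next observation. All sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def quantize(z, levels=8):
    """Per-dimension quantization to a fixed grid with a straight-through gradient."""
    z = torch.tanh(z)                                   # bound to (-1, 1)
    zq = torch.round(z * (levels / 2)) / (levels / 2)
    return z + (zq - z).detach()                        # straight-through estimator

class LatentModel(nn.Module):
    """Encoder plus latent transition model used for the consistency loss (illustrative)."""
    def __init__(self, obs_dim=39, act_dim=6, latent_dim=64):
        super().__init__()
        self.enc = nn.Sequential(nn.Linear(obs_dim, 256), nn.ELU(), nn.Linear(256, latent_dim))
        self.trans = nn.Sequential(nn.Linear(latent_dim + act_dim, 256), nn.ELU(),
                                   nn.Linear(256, latent_dim))

    def consistency_loss(self, obs, act, next_obs):
        z = quantize(self.enc(obs))
        z_next_pred = quantize(self.trans(torch.cat([z, act], dim=-1)))
        with torch.no_grad():                           # target branch; often a momentum copy
            z_next = quantize(self.enc(next_obs))
        return F.mse_loss(z_next_pred, z_next)

model = LatentModel()
obs, act, next_obs = torch.rand(32, 39), torch.rand(32, 6), torch.rand(32, 39)
loss = model.consistency_loss(obs, act, next_obs)
loss.backward()
```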
- Understanding Masked Autoencoders From a Local Contrastive Perspective [80.57196495601826]
Masked AutoEncoder (MAE) has revolutionized the field of self-supervised learning with its simple yet effective masking and reconstruction strategies.
We introduce a new empirical framework, called Local Contrastive MAE, to analyze both reconstructive and contrastive aspects of MAE.
arXiv Detail & Related papers (2023-10-03T12:08:15Z)
- Understanding Masked Autoencoders via Hierarchical Latent Variable Models [109.35382136147349]
Masked autoencoder (MAE) has recently achieved prominent success in a variety of vision tasks.
Despite the emergence of intriguing empirical observations on MAE, a theoretically principled understanding is still lacking.
arXiv Detail & Related papers (2023-06-08T03:00:10Z)
- MA2CL: Masked Attentive Contrastive Learning for Multi-Agent Reinforcement Learning [128.19212716007794]
We propose an effective framework called Multi-Agent Masked Attentive Contrastive Learning (MA2CL).
MA2CL encourages the learned representations to be both temporally and agent-level predictive by reconstructing masked agent observations in latent space.
Our method significantly improves the performance and sample efficiency of different MARL algorithms and outperforms other methods in various vision-based and state-based scenarios.
arXiv Detail & Related papers (2023-06-03T05:32:19Z)
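A speculative sketch of the masked-agent reconstruction idea: embed per-agent observations, replace a random subset of agents with a learned mask token, and let attention over the team reconstruct the masked agents' latents. The architecture and loss below are illustrative assumptions, not the MA2CL implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedAgentLatentReconstruction(nn.Module):
    """Reconstruct masked agents' latents from the remaining agents via self-attention."""
    def __init__(self, obs_dim=30, latent_dim=64, n_heads=4):
        super().__init__()
        self.embed = nn.Linear(obs_dim, latent_dim)
        self.mask_token = nn.Parameter(torch.zeros(latent_dim))
        self.attn = nn.TransformerEncoderLayer(latent_dim, n_heads, dim_feedforward=128,
                                               batch_first=True)
        self.head = nn.Linear(latent_dim, latent_dim)

    def forward(self, agent_obs, mask_ratio=0.3):        # agent_obs: (B, N_agents, obs_dim)
        z = self.embed(agent_obs)                         # per-agent latents (targets)
        drop = torch.rand(z.shape[:2]) < mask_ratio       # which agents to mask
        z_in = torch.where(drop[..., None], self.mask_token.expand_as(z), z)
        z_rec = self.head(self.attn(z_in))                # reconstructed latents
        # regress reconstructed latents onto the (detached) original latents
        return F.mse_loss(z_rec[drop], z.detach()[drop])

model = MaskedAgentLatentReconstruction()
loss = model(torch.rand(16, 5, 30))                       # 16 envs, 5 agents each
loss.backward()
```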
- Light-weight probing of unsupervised representations for Reinforcement Learning [20.638410483549706]
We study whether linear probing can be a proxy evaluation task for the quality of unsupervised RL representations.
We show that the probing tasks are strongly rank correlated with the downstream RL performance on the Atari100k Benchmark.
This provides a more efficient method for exploring the space of pretraining algorithms and identifying promising pretraining recipes.
arXiv Detail & Related papers (2022-08-25T21:08:01Z)
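A toy sketch of the probing protocol implied above: fit a linear probe on frozen features from each candidate pretrained encoder, then rank-correlate the probe scores with downstream RL returns. The proxy labels, features, and returns below are random placeholders; the probe target in the actual paper may differ.

```python
import numpy as np
from scipy.stats import spearmanr
from sklearn.linear_model import LogisticRegression

def probe_score(features, labels):
    """Fit a linear probe on frozen features and return held-out accuracy.
    features: (N, D) encoder outputs; labels: (N,) a proxy target such as the
    agent's action or a discretized reward (an assumption for illustration)."""
    n = len(features) // 2
    clf = LogisticRegression(max_iter=1000).fit(features[:n], labels[:n])
    return clf.score(features[n:], labels[n:])

# One probe score per candidate pretraining recipe, plus its downstream RL return.
rng = np.random.default_rng(0)
probe_scores, rl_returns = [], []
for _ in range(6):                                    # 6 hypothetical pretrained encoders
    feats = rng.normal(size=(400, 32))
    labels = rng.integers(0, 4, size=400)
    probe_scores.append(probe_score(feats, labels))
    rl_returns.append(rng.uniform(0, 100))            # placeholder returns
rho, _ = spearmanr(probe_scores, rl_returns)          # rank correlation as the proxy signal
print(f"Spearman rank correlation: {rho:.2f}")
```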
- Contrastive UCB: Provably Efficient Contrastive Self-Supervised Learning in Online Reinforcement Learning [92.18524491615548]
Contrastive self-supervised learning has been successfully integrated into the practice of (deep) reinforcement learning (RL).
We study how RL can be empowered by contrastive learning in a class of Markov decision processes (MDPs) and Markov games (MGs) with low-rank transitions.
Under the online setting, we propose novel upper confidence bound (UCB)-type algorithms that incorporate such a contrastive loss with online RL algorithms for MDPs or MGs.
arXiv Detail & Related papers (2022-07-29T17:29:08Z)
- Learning Temporally-Consistent Representations for Data-Efficient Reinforcement Learning [3.308743964406687]
$k$-Step Latent (KSL) is a representation learning method that enforces temporal consistency of representations.
KSL produces encoders that generalize better to new tasks unseen during training.
arXiv Detail & Related papers (2021-10-11T00:16:43Z)
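A minimal sketch of a k-step latent consistency objective in the spirit of the summary above: roll a latent forward with a learned transition model for k steps and match the target encoder's latent of each future observation. The sizes, the stop-gradient target encoder, and the cosine loss are assumptions of this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

latent_dim, act_dim, k = 64, 4, 3
encoder = nn.Sequential(nn.Linear(17, 128), nn.ReLU(), nn.Linear(128, latent_dim))
target_encoder = nn.Sequential(nn.Linear(17, 128), nn.ReLU(), nn.Linear(128, latent_dim))
target_encoder.load_state_dict(encoder.state_dict())        # a momentum copy in practice
transition = nn.Sequential(nn.Linear(latent_dim + act_dim, 128), nn.ReLU(),
                           nn.Linear(128, latent_dim))

def ksl_loss(obs_seq, act_seq):
    """obs_seq: (B, k+1, obs_dim); act_seq: (B, k, act_dim)."""
    z = encoder(obs_seq[:, 0])
    loss = 0.0
    for t in range(k):                                       # roll the latent forward k steps
        z = transition(torch.cat([z, act_seq[:, t]], dim=-1))
        with torch.no_grad():
            z_tgt = target_encoder(obs_seq[:, t + 1])
        loss = loss + (1 - F.cosine_similarity(z, z_tgt, dim=-1).mean())
    return loss / k

loss = ksl_loss(torch.rand(32, k + 1, 17), torch.rand(32, k, act_dim))
loss.backward()
```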
- Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
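A simplified, assumption-laden sketch of the exploration signal described above: the inverse model's prediction error plus a k-step learning-progress term (how much that error dropped over the last k updates) serves as the reward of a discovery policy. Network sizes and the exact combination of terms are assumptions of this sketch.

```python
import collections
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim, k = 17, 6, 5
inverse_model = nn.Sequential(nn.Linear(2 * obs_dim, 128), nn.ReLU(), nn.Linear(128, act_dim))
optimizer = torch.optim.Adam(inverse_model.parameters(), lr=3e-4)
error_history = collections.deque(maxlen=k)

def exploration_reward(obs, next_obs, action):
    """Train the inverse model one step and return the intrinsic reward."""
    pred_action = inverse_model(torch.cat([obs, next_obs], dim=-1))
    error = F.mse_loss(pred_action, action)
    optimizer.zero_grad()
    error.backward()
    optimizer.step()
    error_history.append(error.item())
    # learning progress: error k updates ago minus current error (0 until history fills)
    progress = max(error_history[0] - error_history[-1], 0.0) if len(error_history) == k else 0.0
    return error.item() + progress

r = exploration_reward(torch.rand(8, obs_dim), torch.rand(8, obs_dim), torch.rand(8, act_dim))
print(f"intrinsic reward: {r:.3f}")
```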
- Sample-efficient Reinforcement Learning Representation Learning with Curiosity Contrastive Forward Dynamics Model [17.41484483119774]
This paper presents a learning framework, the Curiosity Contrastive Forward Dynamics Model (CCFDM), for achieving more sample-efficient reinforcement learning (RL).
CCFDM incorporates a forward dynamics model (FDM) and performs contrastive learning to train its deep convolutional neural network-based image encoder (IE).
During training, CCFDM provides intrinsic rewards based on the FDM prediction error, encouraging the curiosity of the RL agent and improving exploration.
arXiv Detail & Related papers (2021-03-15T10:08:52Z)
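An illustrative sketch of the two roles of the forward dynamics model described in the entry above: an InfoNCE-style contrastive loss for training the encoder and an intrinsic curiosity reward from the FDM's per-sample prediction error. The specific losses, temperature, and sizes are assumptions of this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

obs_dim, act_dim, latent_dim = 17, 6, 64
encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
fdm = nn.Sequential(nn.Linear(latent_dim + act_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))

def ccfdm_step(obs, act, next_obs, reward, beta=0.1):
    z, z_next = encoder(obs), encoder(next_obs)
    z_pred = fdm(torch.cat([z, act], dim=-1))
    # InfoNCE over the batch: the true next latent is the positive for each prediction
    logits = z_pred @ z_next.t() / 0.1                        # (B, B) similarity matrix
    labels = torch.arange(len(obs))
    contrastive_loss = F.cross_entropy(logits, labels)
    # intrinsic curiosity reward from per-sample FDM prediction error
    intrinsic = (z_pred - z_next).pow(2).mean(dim=-1).detach()
    return contrastive_loss, reward + beta * intrinsic

loss, shaped_reward = ccfdm_step(torch.rand(16, obs_dim), torch.rand(16, act_dim),
                                 torch.rand(16, obs_dim), torch.rand(16))
loss.backward()
```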