Learning Temporally-Consistent Representations for Data-Efficient
Reinforcement Learning
- URL: http://arxiv.org/abs/2110.04935v1
- Date: Mon, 11 Oct 2021 00:16:43 GMT
- Title: Learning Temporally-Consistent Representations for Data-Efficient
Reinforcement Learning
- Authors: Trevor McInroe, Lukas Schäfer, Stefano V. Albrecht
- Abstract summary: $k$-Step Latent (KSL) is a representation learning method that enforces temporal consistency of representations.
KSL produces encoders that generalize better to new tasks unseen during training.
- Score: 3.308743964406687
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Deep reinforcement learning (RL) agents that exist in high-dimensional state
spaces, such as those composed of images, have interconnected learning burdens.
Agents must learn an action-selection policy that completes their given task,
which requires them to learn a representation of the state space that discerns
between useful and useless information. The reward function is the only
supervised feedback that RL agents receive, which causes a representation
learning bottleneck that can manifest in poor sample efficiency. We present
$k$-Step Latent (KSL), a new representation learning method that enforces
temporal consistency of representations via a self-supervised auxiliary task
wherein agents learn to recurrently predict action-conditioned representations
of the state space. The state encoder learned by KSL produces low-dimensional
representations that make optimization of the RL task more sample efficient.
Altogether, KSL produces state-of-the-art results in both data efficiency and
asymptotic performance in the popular PlaNet benchmark suite. Our analyses show
that KSL produces encoders that generalize better to new tasks unseen during
training, and its representations are more strongly tied to reward, are more
invariant to perturbations in the state space, and move more smoothly through
the temporal axis of the RL problem than other methods such as DrQ, RAD, CURL,
and SAC-AE.
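To make the auxiliary objective concrete, below is a minimal PyTorch sketch of a k-step, action-conditioned latent prediction loss. The module sizes, the cosine-similarity objective, and the names (KStepLatentSketch, update_target) are illustrative assumptions rather than the authors' implementation; a momentum target encoder is a common design in this family of methods and is assumed here.

```python
# Minimal sketch of a k-step action-conditioned latent prediction loss,
# in the spirit of KSL. Module names, shapes, and the cosine objective
# are illustrative assumptions, not the authors' exact implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class KStepLatentSketch(nn.Module):
    def __init__(self, obs_dim=64, act_dim=6, latent_dim=50):
        super().__init__()
        # Online encoder trained by the auxiliary loss (and the RL loss).
        self.encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                     nn.Linear(128, latent_dim))
        # Momentum ("target") encoder, updated as an EMA of the online one.
        self.target_encoder = nn.Sequential(nn.Linear(obs_dim, 128), nn.ReLU(),
                                            nn.Linear(128, latent_dim))
        self.target_encoder.load_state_dict(self.encoder.state_dict())
        # Transition model: predicts the next latent from (latent, action).
        self.transition = nn.Sequential(nn.Linear(latent_dim + act_dim, 128),
                                        nn.ReLU(), nn.Linear(128, latent_dim))

    def loss(self, obs_seq, act_seq):
        # obs_seq: (k+1, batch, obs_dim); act_seq: (k, batch, act_dim)
        z = self.encoder(obs_seq[0])  # encode the first observation
        total = 0.0
        for t in range(act_seq.shape[0]):
            # Recurrently roll the latent forward, conditioned on the action.
            z = self.transition(torch.cat([z, act_seq[t]], dim=-1))
            with torch.no_grad():
                target = self.target_encoder(obs_seq[t + 1])
            # Negative cosine similarity pulls each predicted latent toward
            # the target encoder's representation of the true next state.
            total = total - F.cosine_similarity(z, target, dim=-1).mean()
        return total / act_seq.shape[0]

    @torch.no_grad()
    def update_target(self, tau=0.01):
        # Exponential moving average update of the momentum encoder.
        for p, tp in zip(self.encoder.parameters(),
                         self.target_encoder.parameters()):
            tp.data.lerp_(p.data, tau)
```

In training, one would sample length-k trajectory segments from the replay buffer, add this loss to the RL objective, and call update_target after each gradient step.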
Related papers
- iQRL -- Implicitly Quantized Representations for Sample-efficient Reinforcement Learning [24.684363928059113]
We propose an efficient representation learning method using only a self-supervised latent-state consistency loss.
We achieve high performance and prevent representation collapse by quantizing the latent representation.
Our method, named iQRL: implicitly Quantized Reinforcement Learning, is straightforward and compatible with any model-free RL algorithm.
arXiv Detail & Related papers (2024-06-04T18:15:44Z)
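As a rough illustration of the quantization idea in the entry above, a straight-through estimator can snap latents onto a fixed grid so that gradients still flow while the representation cannot drift arbitrarily; the tanh bounding and grid resolution are assumptions for illustration, not iQRL's actual scheme.

```python
# Sketch of latent quantization with a straight-through estimator. The
# grid size and tanh bounding are illustrative, not iQRL's actual scheme.
import torch

def quantize_latent(z, num_bins=64):
    # Bound each latent dimension, then snap it to a uniform grid.
    z = torch.tanh(z)
    z_q = torch.round(z * (num_bins / 2)) / (num_bins / 2)
    # Straight-through estimator: the forward pass uses the quantized
    # value, while the backward pass treats quantization as the identity.
    return z + (z_q - z).detach()
```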
- Improving Reinforcement Learning Efficiency with Auxiliary Tasks in Non-Visual Environments: A Comparison [0.0]
This study compares common auxiliary tasks on top of what is, to the best of our knowledge, the only decoupled representation learning method for low-dimensional, non-visual observations.
Our findings show that representation learning with auxiliary tasks only provides performance gains in sufficiently complex environments.
arXiv Detail & Related papers (2023-10-06T13:22:26Z)
- TACO: Temporal Latent Action-Driven Contrastive Loss for Visual Reinforcement Learning [73.53576440536682]
We introduce TACO: Temporal Action-driven Contrastive Learning, a powerful temporal contrastive learning approach.
TACO simultaneously learns a state and an action representation by optimizing the mutual information between representations of current states paired with action sequences and representations of the corresponding future states.
For online RL, TACO achieves a 40% performance boost after one million environment interaction steps.
arXiv Detail & Related papers (2023-06-22T22:21:53Z)
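A standard way to optimize a mutual-information objective like the one above is an InfoNCE-style contrastive loss. The sketch below is a generic illustration of that family under assumed embeddings, not TACO's reference implementation.

```python
# Generic InfoNCE-style temporal contrastive loss: embeddings of
# (state, action-sequence) pairs are trained to agree with embeddings of
# the corresponding future states, with other batch elements as negatives.
import torch
import torch.nn.functional as F

def temporal_infonce(z_sa, z_future, temperature=0.1):
    # z_sa:     (batch, dim) embedding of current state + action sequence
    # z_future: (batch, dim) embedding of the state several steps later
    z_sa = F.normalize(z_sa, dim=-1)
    z_future = F.normalize(z_future, dim=-1)
    logits = z_sa @ z_future.t() / temperature  # (batch, batch) similarities
    labels = torch.arange(z_sa.shape[0], device=z_sa.device)
    # Diagonal entries are the positive pairs; off-diagonals are negatives.
    return F.cross_entropy(logits, labels)
```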
- Representation Learning in Deep RL via Discrete Information Bottleneck [39.375822469572434]
We study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information.
We propose architectures, coined RepDIB, that use variational and discrete information bottlenecks to learn structured, factorized representations.
arXiv Detail & Related papers (2022-12-28T14:38:12Z)
- Mask-based Latent Reconstruction for Reinforcement Learning [58.43247393611453]
Mask-based Latent Reconstruction (MLR) predicts complete state representations in the latent space from observations whose pixels are spatially and temporally masked.
Extensive experiments show that MLR significantly improves sample efficiency in deep reinforcement learning.
arXiv Detail & Related papers (2022-01-28T13:07:11Z)
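The sketch below illustrates the masked-latent-reconstruction idea under assumed encoder and predictor modules; the mask granularity, mask ratio, and MSE objective are illustrative choices, not MLR's exact recipe.

```python
# Sketch of a mask-based latent reconstruction loss: random spatio-temporal
# patches of pixels are masked out, and the model must predict the momentum
# encoder's latents of the intact frames. Shapes and modules are assumed.
import torch
import torch.nn.functional as F

def masked_latent_loss(encoder, target_encoder, predictor, obs, mask_ratio=0.5):
    # obs: (batch, time, channels, height, width) pixel observations
    b, t, c, h, w = obs.shape
    # Random mask over 8x8 spatial patches, drawn independently per frame.
    mask = (torch.rand(b, t, 1, h // 8, w // 8, device=obs.device) < mask_ratio)
    mask = mask.float().repeat_interleave(8, -2).repeat_interleave(8, -1)
    masked_obs = obs * (1.0 - mask)               # zero out masked pixels
    z_pred = predictor(encoder(masked_obs))       # predict complete latents
    with torch.no_grad():
        z_target = target_encoder(obs)            # latents of intact frames
    return F.mse_loss(z_pred, z_target)
```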
- Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
- Provably Efficient Representation Selection in Low-rank Markov Decision Processes: From Online to Offline RL [84.14947307790361]
We propose an efficient algorithm, called ReLEX, for representation learning in both online and offline reinforcement learning.
We show that the online version of ReLEX, called Re-UCB, always performs no worse than the state-of-the-art algorithm without representation selection.
For the offline counterpart, ReLEX-LCB, we show that the algorithm can find the optimal policy if the representation class can cover the state-action space.
arXiv Detail & Related papers (2021-06-22T17:16:50Z)
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
- Return-Based Contrastive Representation Learning for Reinforcement Learning [126.7440353288838]
We propose a novel auxiliary task that forces the learnt representations to discriminate state-action pairs with different returns.
Our algorithm outperforms strong baselines on complex tasks in Atari games and the DeepMind Control suite.
arXiv Detail & Related papers (2021-02-22T13:04:18Z)
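For illustration, the return-discrimination idea in the last entry can be sketched as a pairwise loss in which state-action embeddings with similar returns act as positives; the threshold rule and binary objective are assumptions, not the paper's exact formulation.

```python
# Sketch of a return-based contrastive auxiliary loss: state-action
# embeddings are pushed together when their returns are close and apart
# otherwise. Threshold and objective are illustrative assumptions.
import torch
import torch.nn.functional as F

def return_contrastive_loss(z_sa, returns, threshold=0.1):
    # z_sa: (batch, dim) state-action embeddings; returns: (batch,)
    z = F.normalize(z_sa, dim=-1)
    sim = z @ z.t()                               # pairwise cosine similarities
    # Positive pairs: state-action pairs whose returns are close.
    pos = (returns[:, None] - returns[None, :]).abs() < threshold
    eye = torch.eye(len(returns), dtype=torch.bool, device=z.device)
    pos = pos & ~eye                              # exclude self-pairs
    # Binary discrimination of same-return vs. different-return pairs.
    return F.binary_cross_entropy_with_logits(sim[~eye], pos[~eye].float())
```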
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.