On the Importance of Feature Decorrelation for Unsupervised
Representation Learning in Reinforcement Learning
- URL: http://arxiv.org/abs/2306.05637v1
- Date: Fri, 9 Jun 2023 02:47:21 GMT
- Title: On the Importance of Feature Decorrelation for Unsupervised
Representation Learning in Reinforcement Learning
- Authors: Hojoon Lee and Koanho Lee and Dongyoon Hwang and Hyunho Lee and
Byungkun Lee and Jaegul Choo
- Abstract summary: Unsupervised representation learning (URL) has improved the sample efficiency of Reinforcement Learning (RL).
We propose a novel URL framework that causally predicts future states while increasing the dimension of the latent manifold.
Our framework effectively learns predictive representations without collapse, which significantly improves the sample efficiency of state-of-the-art URL methods on the Atari 100k benchmark.
- Score: 23.876039876806182
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recently, unsupervised representation learning (URL) has improved the sample
efficiency of Reinforcement Learning (RL) by pretraining a model from a large
unlabeled dataset. The underlying principle of these methods is to learn
temporally predictive representations by predicting future states in the latent
space. However, an important challenge of this approach is representational
collapse, where the subspace of the latent representations collapses into a
low-dimensional manifold. To address this issue, we propose a novel URL
framework that causally predicts future states while increasing the dimension
of the latent manifold by decorrelating the features in the latent space.
Through extensive empirical studies, we demonstrate that our framework
effectively learns predictive representations without collapse, which
significantly improves the sample efficiency of state-of-the-art URL methods on
the Atari 100k benchmark. The code is available at
https://github.com/dojeon-ai/SimTPR.
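To make the two ingredients in the abstract concrete (latent-space future prediction and feature decorrelation), here is a minimal PyTorch-style sketch. It illustrates the general technique only and is not the authors' SimTPR code: the encoder, the transition head, the loss weighting, and where the decorrelation penalty is applied are all assumptions.

```python
import torch
import torch.nn.functional as F

def decorrelation_loss(z: torch.Tensor) -> torch.Tensor:
    """Penalize off-diagonal entries of the feature correlation matrix.

    z: (batch, dim) latent features. Pushing off-diagonal correlations
    toward zero spreads variance across dimensions, counteracting collapse.
    """
    z = (z - z.mean(0)) / (z.std(0) + 1e-6)       # standardize each feature
    corr = (z.T @ z) / z.shape[0]                 # (dim, dim) correlation matrix
    off_diag = corr - torch.diag_embed(torch.diagonal(corr))
    return off_diag.pow(2).sum() / z.shape[1]

def url_loss(encoder, transition, obs_t, act_t, obs_t1, beta=1.0):
    """One-step temporally predictive loss plus a decorrelation penalty.

    encoder/transition are hypothetical modules: encoder maps observations
    to latents; transition predicts the next latent from (latent, action).
    """
    z_t = encoder(obs_t)
    with torch.no_grad():                         # stop-gradient target, as in
        z_t1 = encoder(obs_t1)                    # many self-predictive methods
    z_pred = transition(z_t, act_t)
    pred_loss = F.mse_loss(z_pred, z_t1)
    return pred_loss + beta * decorrelation_loss(z_t)
```

The decorrelation term plays the same role as the redundancy-reduction penalties in methods like Barlow Twins: by driving off-diagonal correlations toward zero, it keeps the effective dimension of the latent manifold high while the prediction term makes the representations temporally predictive.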
Related papers
- Idempotent Unsupervised Representation Learning for Skeleton-Based Action Recognition [13.593511876719367]
We propose a novel skeleton-based idempotent generative model (IGM) for unsupervised representation learning (a toy sketch of the idempotency property appears after this list).
Our experiments on the benchmark datasets NTU RGB+D and PKU-MMD demonstrate the effectiveness of the proposed method.
arXiv Detail & Related papers (2024-10-27T06:29:04Z)
- State Sequences Prediction via Fourier Transform for Representation Learning [111.82376793413746]
We propose State Sequences Prediction via Fourier Transform (SPF), a novel method for learning expressive representations efficiently (see the frequency-domain sketch after this list).
We theoretically analyze the existence of structural information in state sequences, which is closely related to policy performance and signal regularity.
Experiments demonstrate that the proposed method outperforms several state-of-the-art algorithms in terms of both sample efficiency and performance.
arXiv Detail & Related papers (2023-10-24T14:47:02Z)
- Spatial-Temporal Graph Learning with Adversarial Contrastive Adaptation [19.419836274690816]
We propose a new spatial-temporal graph learning model (GraphST) to enable effective self-supervised learning.
Our proposed model follows an adversarial contrastive learning paradigm that automates the distillation of crucial multi-view self-supervised information.
We demonstrate the superiority of our proposed GraphST method in various spatial-temporal prediction tasks on real-life datasets.
arXiv Detail & Related papers (2023-06-19T03:09:35Z)
- Understanding Self-Predictive Learning for Reinforcement Learning [61.62067048348786]
We study the learning dynamics of self-predictive learning for reinforcement learning.
We propose a novel self-predictive algorithm that learns two representations simultaneously.
arXiv Detail & Related papers (2022-12-06T20:43:37Z)
- Value-Consistent Representation Learning for Data-Efficient Reinforcement Learning [105.70602423944148]
We propose a novel method, called value-consistent representation learning (VCR), to learn representations that are directly related to decision-making (a sketch of the value-consistency idea follows this list).
Instead of directly aligning an imagined state with the real state returned by the environment, VCR applies a $Q$-value head to both states and obtains two distributions of action values.
Our method is demonstrated to achieve new state-of-the-art performance among search-free RL algorithms.
arXiv Detail & Related papers (2022-06-25T03:02:25Z)
- Bayesian Graph Contrastive Learning [55.36652660268726]
We propose a novel perspective on graph contrastive learning methods, showing that random augmentations lead to stochastic encoders.
Our proposed method represents each node by a distribution in the latent space in contrast to existing techniques which embed each node to a deterministic vector.
We show a considerable improvement in performance compared to existing state-of-the-art methods on several benchmark datasets.
arXiv Detail & Related papers (2021-12-15T01:45:32Z)
- An Effective Baseline for Robustness to Distributional Shift [5.627346969563955]
Refraining from confidently predicting when faced with categories of inputs different from those seen during training is an important requirement for the safe deployment of deep learning systems.
We present a simple but highly effective approach to out-of-distribution detection that uses the principle of abstention.
arXiv Detail & Related papers (2021-05-15T00:46:11Z)
- Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework preserves the relations between samples well.
By seeking to embed samples into a subspace, we show that our method can address large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z)
- Value-driven Hindsight Modelling [68.658900923595]
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
Model learning can make use of the rich transition structure present in sequences of observations, but this approach is usually not sensitive to the reward function.
We develop an approach for representation learning in RL that sits in between these two extremes.
This provides tractable prediction targets that are directly relevant for a task, and can thus accelerate learning the value function.
arXiv Detail & Related papers (2020-02-19T18:10:20Z)
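For the idempotent generative model (IGM) entry above, the defining property is that applying the model twice should equal applying it once, f(f(x)) = f(x). A toy regularizer encoding that property might look like the following; the model interface is hypothetical and this is not the paper's full objective:

```python
import torch.nn.functional as F

def idempotency_loss(model, x):
    # One pass produces the target; a second pass should be a fixed point:
    # model(model(x)) ~= model(x). The stop-gradient on the target is a
    # common self-supervised trick, assumed here rather than taken from
    # the paper.
    y = model(x)
    return F.mse_loss(model(y), y.detach())
```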
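For the State Sequences Prediction via Fourier Transform (SPF) entry, the stated idea is to learn representations by predicting frequency-domain features of state sequences rather than raw states. A rough sketch of such a loss, with hypothetical shapes and predictor network:

```python
import torch
import torch.nn.functional as F

def spf_style_loss(predictor, z_t, future_states):
    # future_states: (batch, horizon, state_dim); FFT over the time axis.
    spectrum = torch.fft.rfft(future_states, dim=1)
    # Complex spectrum -> real-valued regression target.
    target = torch.cat([spectrum.real, spectrum.imag], dim=-1).flatten(1)
    # 'predictor' is a hypothetical network mapping the current latent z_t
    # to an output whose dimension matches the flattened spectrum.
    return F.mse_loss(predictor(z_t), target.detach())
```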
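Finally, for the Value-Consistent Representation Learning (VCR) entry, the summary describes applying a shared $Q$-value head to an imagined state and the corresponding real state, then matching the two action-value distributions. A hedged sketch, in which the temperature, the KL direction, and the module interfaces are guesses rather than the paper's specification:

```python
import torch
import torch.nn.functional as F

def value_consistency_loss(q_head, z_imagined, z_real, tau=1.0):
    # Shared Q-value head -> (batch, n_actions) logits for both latents.
    log_p_imagined = F.log_softmax(q_head(z_imagined) / tau, dim=-1)
    with torch.no_grad():
        p_real = F.softmax(q_head(z_real) / tau, dim=-1)  # treated as target
    # F.kl_div expects log-probs first, probs second: KL(real || imagined).
    return F.kl_div(log_p_imagined, p_real, reduction="batchmean")
```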
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.