Learning Invariant Representations for Reinforcement Learning without
Reconstruction
- URL: http://arxiv.org/abs/2006.10742v2
- Date: Wed, 7 Apr 2021 01:57:14 GMT
- Title: Learning Invariant Representations for Reinforcement Learning without
Reconstruction
- Authors: Amy Zhang, Rowan McAllister, Roberto Calandra, Yarin Gal, Sergey
Levine
- Abstract summary: We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction.
Bisimulation metrics quantify behavioral similarity between states in continuous MDPs.
We demonstrate the effectiveness of our method at disregarding task-irrelevant information using modified visual MuJoCo tasks.
- Score: 98.33235415273562
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study how representation learning can accelerate reinforcement learning from rich observations, such as images, without relying on either domain knowledge or pixel reconstruction. Our goal is to learn representations that both support effective downstream control and remain invariant to task-irrelevant details. Bisimulation metrics quantify behavioral similarity between states in continuous MDPs; we propose using them to learn robust latent representations that encode only the task-relevant information from observations. Our method trains encoders such that distances in latent space equal bisimulation distances in state space. We demonstrate the effectiveness of our method at disregarding task-irrelevant information on modified visual MuJoCo tasks, where the background is replaced with moving distractors and natural videos, while achieving SOTA performance. We also test on a first-person highway driving task, where our method learns invariance to clouds, weather, and time of day. Finally, we provide generalization results drawn from properties of bisimulation metrics, and links to causal inference.
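For concreteness, below is a minimal sketch of the kind of objective the abstract describes: an encoder is trained so that L1 distances between latent states match a bootstrapped bisimulation target combining reward differences with a Wasserstein distance between predicted next-state distributions. The module interfaces, the random within-batch pairing, and the closed-form 2-Wasserstein distance for diagonal Gaussians are assumptions for illustration, not the authors' exact implementation.

```python
# Hedged sketch of a bisimulation-style representation loss (PyTorch).
# Assumed interfaces: encoder(obs) -> (B, D) latents; reward_model(z) -> (B, 1)
# predicted rewards; dynamics_model(z) -> (mu, sigma), each (B, D), the
# parameters of a diagonal-Gaussian next-latent distribution.
import torch
import torch.nn.functional as F

def bisim_loss(encoder, reward_model, dynamics_model, obs, gamma=0.99):
    z = encoder(obs)                      # latent states
    perm = torch.randperm(z.size(0))      # pair each state with a random peer
    z2 = z[perm]

    r1, r2 = reward_model(z), reward_model(z2)
    mu1, sigma1 = dynamics_model(z)
    mu2, sigma2 = dynamics_model(z2)

    # Latent distance the encoder should shape to match the bisimulation target.
    z_dist = torch.sum(torch.abs(z - z2), dim=-1)

    # Closed-form 2-Wasserstein distance between diagonal Gaussians.
    w2 = torch.sqrt(
        torch.sum((mu1 - mu2) ** 2, dim=-1)
        + torch.sum((sigma1 - sigma2) ** 2, dim=-1)
    )

    # Bootstrapped bisimulation target: reward gap + discounted transition gap.
    target = torch.abs(r1 - r2).squeeze(-1) + gamma * w2
    return F.mse_loss(z_dist, target.detach())
```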
Related papers
- Value Explicit Pretraining for Learning Transferable Representations [11.069853883599102]
We propose a method that learns generalizable representations for transfer reinforcement learning.
We learn new tasks that share objectives with previously learned tasks by training an encoder for objective-conditioned representations.
Experiments on a realistic navigation simulator and the Atari benchmark show that the pretrained encoder produced by our method outperforms current SOTA pretraining methods.
arXiv Detail & Related papers (2023-12-19T17:12:35Z)
- Sequential Action-Induced Invariant Representation for Reinforcement Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to preserve only the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z)
- MOCA: Self-supervised Representation Learning by Predicting Masked Online Codebook Assignments [72.6405488990753]
Self-supervised learning can be used to mitigate the data-hungry training requirements of Vision Transformer networks.
We propose a single-stage and standalone method, MOCA, which unifies both desired properties.
We achieve new state-of-the-art results in low-shot settings and strong experimental results in various evaluation protocols.
arXiv Detail & Related papers (2023-07-18T15:46:20Z)
- SeMAIL: Eliminating Distractors in Visual Imitation via Separated Models [22.472167814814448]
We propose a new model-based imitation learning algorithm named Separated Model-based Adversarial Imitation Learning (SeMAIL).
Our method achieves near-expert performance on various visual control tasks with complex observations, and on more challenging tasks whose backgrounds differ from those in the expert observations.
arXiv Detail & Related papers (2023-06-19T04:33:44Z)
- Reinforcement Learning from Passive Data via Latent Intentions [86.4969514480008]
We show that passive data can still be used to learn features that accelerate downstream RL.
Our approach learns from passive data by modeling intentions.
Our experiments demonstrate the ability to learn from many forms of passive data, including cross-embodiment video data and YouTube videos.
arXiv Detail & Related papers (2023-04-10T17:59:05Z)
- Constrained Mean Shift for Representation Learning [17.652439157554877]
We develop a non-contrastive representation learning method that can exploit additional knowledge.
Our main idea is to generalize the mean-shift algorithm by constraining the search space of nearest neighbors (see the sketch after this list).
We show that it is possible to use the noisy constraint across modalities to train self-supervised video models.
arXiv Detail & Related papers (2021-10-19T23:14:23Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- PsiPhi-Learning: Reinforcement Learning with Demonstrations using Successor Features and Inverse Temporal Difference Learning [102.36450942613091]
We propose an inverse reinforcement learning algorithm called inverse temporal difference learning (ITD).
We show how to seamlessly integrate ITD with learning from online environment interactions, arriving at a novel algorithm for reinforcement learning with demonstrations, called ΨΦ-learning (see the successor-feature sketch after this list).
arXiv Detail & Related papers (2021-02-24T21:12:09Z)
- Geography-Aware Self-Supervised Learning [79.4009241781968]
We show that due to their different characteristics, a non-trivial gap persists between contrastive and supervised learning on standard benchmarks.
We propose novel training methods that exploit the spatially aligned structure of remote sensing data.
Our experiments show that our proposed method closes the gap between contrastive and supervised learning on image classification, object detection and semantic segmentation for remote sensing.
arXiv Detail & Related papers (2020-11-19T17:29:13Z)
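For the Constrained Mean Shift entry above, the sketch below illustrates one plausible reading of constraining the nearest-neighbor search space: each query is pulled toward its k nearest neighbors in a memory bank, restricted to bank entries that satisfy a constraint (here a matching label; the paper's actual constraints may differ). All names and interfaces are illustrative assumptions.

```python
# Hedged sketch of a constrained mean-shift style objective (PyTorch).
# Assumes each query has at least k bank entries satisfying its constraint.
import torch
import torch.nn.functional as F

def cms_loss(query, bank, query_labels, bank_labels, k=5):
    q = F.normalize(query, dim=-1)            # (B, D) query embeddings
    b = F.normalize(bank, dim=-1)             # (N, D) memory-bank embeddings
    sim = q @ b.t()                           # cosine similarities, (B, N)

    # Constrain the neighbor search: mask out bank entries whose
    # label differs from the query's (an illustrative constraint).
    mask = query_labels[:, None] != bank_labels[None, :]
    sim = sim.masked_fill(mask, float('-inf'))

    # Pull each query toward its k nearest constrained neighbors.
    topk = sim.topk(k, dim=-1).values
    return (2.0 - 2.0 * topk).mean()          # squared L2 on the unit sphere
```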
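For the PsiPhi-Learning entry, a one-step temporal-difference update for successor features is the generic building block that inverse temporal difference learning extends; the sketch below shows that standard update under assumed interfaces (psi_net, psi_target, and phi are placeholders, not the paper's API).

```python
# Hedged sketch of a one-step successor-feature TD update (PyTorch).
# psi should satisfy psi(s, a) = phi(s, a) + gamma * E[psi(s', a')].
import torch

def sf_td_loss(psi_net, psi_target, phi, s, a, s_next, a_next, gamma=0.99):
    with torch.no_grad():
        # Bootstrapped target from a slowly updated target network.
        target = phi(s, a) + gamma * psi_target(s_next, a_next)
    return torch.mean((psi_net(s, a) - target) ** 2)
```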
This list is automatically generated from the titles and abstracts of the papers on this site.