Learning Action-based Representations Using Invariance
- URL: http://arxiv.org/abs/2403.16369v3
- Date: Mon, 24 Jun 2024 13:19:17 GMT
- Title: Learning Action-based Representations Using Invariance
- Authors: Max Rudolph, Caleb Chuck, Kevin Black, Misha Lvovsky, Scott Niekum, Amy Zhang,
- Abstract summary: We introduce action-bisimulation encoding, which learns a multi-step controllability metric that discounts distant state features that are relevant for control.
We demonstrate that action-bisimulation pretraining on reward-free, uniformly random data improves sample efficiency in several environments.
- Score: 18.1941237781348
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Robust reinforcement learning agents using high-dimensional observations must be able to identify relevant state features amidst many exogeneous distractors. A representation that captures controllability identifies these state elements by determining what affects agent control. While methods such as inverse dynamics and mutual information capture controllability for a limited number of timesteps, capturing long-horizon elements remains a challenging problem. Myopic controllability can capture the moment right before an agent crashes into a wall, but not the control-relevance of the wall while the agent is still some distance away. To address this we introduce action-bisimulation encoding, a method inspired by the bisimulation invariance pseudometric, that extends single-step controllability with a recursive invariance constraint. By doing this, action-bisimulation learns a multi-step controllability metric that smoothly discounts distant state features that are relevant for control. We demonstrate that action-bisimulation pretraining on reward-free, uniformly random data improves sample efficiency in several environments, including a photorealistic 3D simulation domain, Habitat. Additionally, we provide theoretical analysis and qualitative results demonstrating the information captured by action-bisimulation.
Related papers
- Learning a Fast Mixing Exogenous Block MDP using a Single Trajectory [87.62730694973696]
STEEL is the first provably sample-efficient algorithm for learning the controllable dynamics of an Exogenous Block Markov Decision Process from a single trajectory.
We prove that STEEL is correct and sample-efficient, and demonstrate STEEL on two toy problems.
arXiv Detail & Related papers (2024-10-03T21:57:21Z) - Sequential Action-Induced Invariant Representation for Reinforcement
Learning [1.2046159151610263]
How to accurately learn task-relevant state representations from high-dimensional observations with visual distractions is a challenging problem in visual reinforcement learning.
We propose a Sequential Action-induced invariant Representation (SAR) method, in which the encoder is optimized by an auxiliary learner to only preserve the components that follow the control signals of sequential actions.
arXiv Detail & Related papers (2023-09-22T05:31:55Z) - Value function estimation using conditional diffusion models for control [62.27184818047923]
We propose a simple algorithm called Diffused Value Function (DVF)
It learns a joint multi-step model of the environment-robot interaction dynamics using a diffusion model.
We show how DVF can be used to efficiently capture the state visitation measure for multiple controllers.
arXiv Detail & Related papers (2023-06-09T18:40:55Z) - Adaptive Discrete Communication Bottlenecks with Dynamic Vector
Quantization [76.68866368409216]
We propose learning to dynamically select discretization tightness conditioned on inputs.
We show that dynamically varying tightness in communication bottlenecks can improve model performance on visual reasoning and reinforcement learning tasks.
arXiv Detail & Related papers (2022-02-02T23:54:26Z) - Efficient Embedding of Semantic Similarity in Control Policies via
Entangled Bisimulation [3.5092955099876266]
Learning generalizeable policies from visual input in the presence of visual distractions is a challenging problem in reinforcement learning.
We propose entangled bisimulation, a bisimulation metric that allows the specification of the distance function between states.
We show how entangled bisimulation can meaningfully improve over previous methods on the Distracting Control Suite (DCS)
arXiv Detail & Related papers (2022-01-28T18:06:06Z) - Is Disentanglement enough? On Latent Representations for Controllable
Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z) - Disentangling Action Sequences: Discovering Correlated Samples [6.179793031975444]
We demonstrate the data itself plays a crucial role in disentanglement and instead of the factors, and the disentangled representations align the latent variables with the action sequences.
We propose a novel framework, fractional variational autoencoder (FVAE) to disentangle the action sequences with different significance step-by-step.
Experimental results on dSprites and 3D Chairs show that FVAE improves the stability of disentanglement.
arXiv Detail & Related papers (2020-10-17T07:37:50Z) - Unsupervised Controllable Generation with Self-Training [90.04287577605723]
controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z) - Efficient Empowerment Estimation for Unsupervised Stabilization [75.32013242448151]
empowerment principle enables unsupervised stabilization of dynamical systems at upright positions.
We propose an alternative solution based on a trainable representation of a dynamical system as a Gaussian channel.
We show that our method has a lower sample complexity, is more stable in training, possesses the essential properties of the empowerment function, and allows estimation of empowerment from images.
arXiv Detail & Related papers (2020-07-14T21:10:16Z) - Unsupervised Discovery, Control, and Disentanglement of Semantic
Attributes with Applications to Anomaly Detection [15.817227809141116]
We focus on unsupervised generative representations that discover latent factors controlling image semantic attributes.
For (a), we propose a network architecture that exploits the combination of multiscale generative models with mutual information (MI)
For (b), we derive an analytical result (Lemma 1) that brings clarity to two related but distinct concepts.
arXiv Detail & Related papers (2020-02-25T20:50:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.