Agent-Controller Representations: Principled Offline RL with Rich
Exogenous Information
- URL: http://arxiv.org/abs/2211.00164v2
- Date: Mon, 14 Aug 2023 00:16:23 GMT
- Title: Agent-Controller Representations: Principled Offline RL with Rich
Exogenous Information
- Authors: Riashat Islam, Manan Tomar, Alex Lamb, Yonathan Efroni, Hongyu Zang,
Aniket Didolkar, Dipendra Misra, Xin Li, Harm van Seijen, Remi Tachet des
Combes, John Langford
- Abstract summary: Learning to control an agent from data collected offline is vital for real-world applications of reinforcement learning (RL). This paper introduces new offline RL benchmarks offering the ability to study this problem. We find that contemporary representation learning techniques can fail on datasets where the noise is a complex and time-dependent process.
- Score: 49.06422815335159
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning to control an agent from data collected offline in a rich
pixel-based visual observation space is vital for real-world applications of
reinforcement learning (RL). A major challenge in this setting is the presence
of input information that is hard to model and irrelevant to controlling the
agent. This problem has been approached by the theoretical RL community through
the lens of exogenous information, i.e., any control-irrelevant information
contained in observations. For example, a robot navigating in busy streets
needs to ignore irrelevant information, such as other people walking in the
background, textures of objects, or birds in the sky. In this paper, we focus
on the setting with visually detailed exogenous information, and introduce new
offline RL benchmarks offering the ability to study this problem. We find that
contemporary representation learning techniques can fail on datasets where the
noise is a complex and time-dependent process, which is prevalent in practical
applications. To address this, we propose to use multi-step inverse models,
which have seen a great deal of interest in the RL theory community, to learn
Agent-Controller Representations for Offline-RL (ACRO). Despite being simple
and requiring no reward, we show theoretically and empirically that the
representation created by this objective greatly outperforms baselines.
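The multi-step inverse objective described above trains an encoder so that the action taken at time t can be predicted from the representations of observations k steps apart; information irrelevant to control gets no gradient signal. A minimal, dependency-light sketch of that loss is below, assuming a linear encoder, a discrete action space, and random stand-in data (all dimensions and names are hypothetical; the paper uses deep networks trained by gradient descent):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: observation dim, latent dim, number of discrete actions, batch.
OBS_DIM, LATENT_DIM, N_ACTIONS, BATCH = 32, 8, 4, 16

# Linear encoder phi and action classifier W: stand-ins for the paper's deep
# networks, used here only to keep the sketch self-contained.
phi = rng.normal(size=(OBS_DIM, LATENT_DIM)) / np.sqrt(OBS_DIM)
W = rng.normal(size=(2 * LATENT_DIM, N_ACTIONS)) / np.sqrt(2 * LATENT_DIM)

def multi_step_inverse_loss(x_t, x_tk, a_t):
    """Cross-entropy for predicting a_t from (phi(x_t), phi(x_{t+k}))."""
    z = np.concatenate([x_t @ phi, x_tk @ phi], axis=1)  # joint latent code
    logits = z @ W
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(a_t)), a_t].mean()

# A random offline batch: observation pairs k steps apart plus the action at t.
x_t = rng.normal(size=(BATCH, OBS_DIM))
x_tk = rng.normal(size=(BATCH, OBS_DIM))
a_t = rng.integers(0, N_ACTIONS, size=BATCH)

loss = multi_step_inverse_loss(x_t, x_tk, a_t)
print(float(loss))
```

Note that the loss needs no reward signal, only (observation, action, later observation) triples from the offline dataset, which is what makes the objective applicable before any policy learning.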
Related papers
- An Examination of Offline-Trained Encoders in Vision-Based Deep Reinforcement Learning for Autonomous Driving [0.0]
Research investigates the challenges Deep Reinforcement Learning (DRL) faces in Partially Observable Markov Decision Processes (POMDPs).
Our research adopts an offline-trained encoder to leverage large video datasets through self-supervised learning to learn generalizable representations.
We show that the features learned by watching BDD100K driving videos can be directly transferred to achieve lane following and collision avoidance in CARLA simulator.
arXiv Detail & Related papers (2024-09-02T14:16:23Z)
- D5RL: Diverse Datasets for Data-Driven Deep Reinforcement Learning [99.33607114541861]
We propose a new benchmark for offline RL that focuses on realistic simulations of robotic manipulation and locomotion environments.
Our proposed benchmark covers state-based and image-based domains, and supports both offline RL and online fine-tuning evaluation.
arXiv Detail & Related papers (2024-08-15T22:27:00Z)
- Efficient RL via Disentangled Environment and Agent Representations [40.114817446130935]
We propose an approach for learning such structured representations for RL algorithms, using visual knowledge of the agent, such as its shape or mask.
We show that our method, Structured Environment-Agent Representations, outperforms state-of-the-art model-free approaches over 18 different challenging visual simulation environments spanning 5 different robots.
arXiv Detail & Related papers (2023-09-05T17:59:45Z)
- Representation Learning in Deep RL via Discrete Information Bottleneck [39.375822469572434]
We study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information.
We propose architectures that utilize variational and discrete information bottlenecks, coined as RepDIB, to learn structured factorized representations.
arXiv Detail & Related papers (2022-12-28T14:38:12Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interactions between the agent and the environment.
We propose a new method to solve it, using unsupervised model-based RL, for pre-training the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- INFOrmation Prioritization through EmPOWERment in Visual Model-Based RL [90.06845886194235]
We propose a modified objective for model-based reinforcement learning (RL).
We integrate a term inspired by variational empowerment into a state-space model based on mutual information.
We evaluate the approach on a suite of vision-based robot control tasks with natural video backgrounds.
arXiv Detail & Related papers (2022-04-18T23:09:23Z)
- Exploratory State Representation Learning [63.942632088208505]
We propose a new approach called XSRL (eXploratory State Representation Learning) to solve the problems of exploration and SRL in parallel.
On one hand, it jointly learns compact state representations and a state transition estimator which is used to remove unexploitable information from the representations.
On the other hand, it continuously trains an inverse model, and adds to the prediction error of this model a $k$-step learning progress bonus to form the objective of a discovery policy.
arXiv Detail & Related papers (2021-09-28T10:11:07Z)
- Offline Reinforcement Learning from Images with Latent Space Models [60.69745540036375]
Offline reinforcement learning (RL) refers to the problem of learning policies from a static dataset of environment interactions.
We build on recent advances in model-based algorithms for offline RL, and extend them to high-dimensional visual observation spaces.
Our approach is both tractable in practice and corresponds to maximizing a lower bound of the ELBO in the unknown POMDP.
arXiv Detail & Related papers (2020-12-21T18:28:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.