Reinforcement Learning, Bit by Bit
- URL: http://arxiv.org/abs/2103.04047v8
- Date: Thu, 4 May 2023 20:53:30 GMT
- Title: Reinforcement Learning, Bit by Bit
- Authors: Xiuyuan Lu, Benjamin Van Roy, Vikranth Dwaracherla, Morteza Ibrahimi,
Ian Osband, Zheng Wen
- Abstract summary: Reinforcement learning agents have demonstrated remarkable achievements in simulated environments.
Data efficiency poses an impediment to carrying this success over to real environments.
We discuss concepts and regret analysis that together offer principled guidance.
- Score: 27.66567077899924
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Reinforcement learning agents have demonstrated remarkable achievements in
simulated environments. Data efficiency poses an impediment to carrying this
success over to real environments. The design of data-efficient agents calls
for a deeper understanding of information acquisition and representation. We
discuss concepts and regret analysis that together offer principled guidance.
This line of thinking sheds light on questions of what information to seek, how
to seek that information, and what information to retain. To illustrate
concepts, we design simple agents that build on them and present computational
results that highlight data efficiency.
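To make the information-seeking perspective concrete, here is a minimal sketch of variance-based information-directed sampling for a Bernoulli bandit, in the spirit of the simple agents the paper studies; the variance proxy for information gain and all names here are illustrative choices, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def ids_action(alpha, beta, n_samples=1000):
    """Pick an arm by minimizing a sample-based information ratio:
    squared expected regret divided by a variance proxy for information gain."""
    theta = rng.beta(alpha, beta, size=(n_samples, len(alpha)))  # posterior draws
    regret = (theta.max(axis=1, keepdims=True) - theta).mean(axis=0)  # per-arm shortfall
    info_gain = theta.var(axis=0) + 1e-12  # posterior variance as an information proxy
    return int(np.argmin(regret ** 2 / info_gain))

# One interaction step with Beta(1, 1) priors over three arms.
alpha, beta = np.ones(3), np.ones(3)
a = ids_action(alpha, beta)
reward = rng.binomial(1, [0.2, 0.5, 0.8][a])  # hypothetical true means
alpha[a] += reward
beta[a] += 1 - reward
```

The argmin trades immediate expected regret against how much an arm's outcome would shrink posterior uncertainty, which is the trade-off the regret analysis formalizes.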
Related papers
- Leveraging Superfluous Information in Contrastive Representation Learning [0.0]
We show that superfluous information does arise within the conventional contrastive learning framework.
We design a new objective, SuperInfo, which learns robust representations from a linear combination of predictive and superfluous information.
We demonstrate that training with our loss often outperforms traditional contrastive learning approaches on image classification, object detection and instance segmentation tasks; see the sketch below.
arXiv Detail & Related papers (2024-08-19T16:21:08Z)
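As a toy illustration of the linear combination above, the sketch below adds a weighted contrastive term over nominally superfluous features to a standard InfoNCE loss; the weighting and the form of the second term are guesses at the shape of such an objective, not the paper's actual SuperInfo loss.

```python
import numpy as np

def info_nce(z1, z2, temp=0.1):
    """Standard InfoNCE: matching rows are positives, other rows in the
    batch serve as negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temp  # pairwise cosine similarities
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

def superinfo_loss(z1, z2, z_superfluous, weight=0.3):
    """Linear combination of a predictive term and a term that retains some
    nominally superfluous signal (hypothetical formulation)."""
    return info_nce(z1, z2) + weight * info_nce(z1, z_superfluous)
```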
- Reinforcement Learning from Passive Data via Latent Intentions [86.4969514480008]
We show that passive data can still be used to learn features that accelerate downstream RL.
Our approach learns from passive data by modeling intentions.
Our experiments demonstrate the ability to learn from many forms of passive data, including cross-embodiment video data and YouTube videos.
arXiv Detail & Related papers (2023-04-10T17:59:05Z)
- Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation with noise-contrastive estimation, together with a model of auxiliary reward, significantly improves sample efficiency on the challenging NetHack benchmark (a minimal NCE sketch follows this entry).
arXiv Detail & Related papers (2023-03-31T18:03:30Z)
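The noise-contrastive piece of the recipe above can be sketched as a binary classifier that scores real transition encodings against noise samples; the auxiliary reward model is omitted, and the scores are assumed to come from some learned critic.

```python
import numpy as np

def nce_loss(pos_scores, noise_scores):
    """Binary noise-contrastive estimation: push scores of real
    (state, next_state) pairs up and scores of noise pairs down."""
    pos_term = np.log1p(np.exp(-pos_scores))     # -log sigmoid(score)
    noise_term = np.log1p(np.exp(noise_scores))  # -log(1 - sigmoid(score))
    return pos_term.mean() + noise_term.mean()

# Scores would come from a critic f(encode(s), encode(s')) in practice.
loss = nce_loss(np.array([2.0, 1.5]), np.array([-0.5, 0.3, -1.2]))
```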
- Representation Learning in Deep RL via Discrete Information Bottleneck [39.375822469572434]
We study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information.
We propose architectures, coined RepDIB, that use variational and discrete information bottlenecks to learn structured, factorized representations (a vector-quantization sketch follows this entry).
arXiv Detail & Related papers (2022-12-28T14:38:12Z)
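A discrete bottleneck of the kind RepDIB builds on can be sketched as vector quantization: each continuous latent is snapped to its nearest entry in a learned codebook. The factorization across multiple codebooks, the variational variant, and the straight-through gradient used in training are all omitted here.

```python
import numpy as np

def discrete_bottleneck(z, codebook):
    """Quantize each latent to its nearest codebook entry, discarding
    task-irrelevant within-cell variation."""
    d = ((z[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)  # (batch, codes)
    idx = d.argmin(axis=1)
    return codebook[idx], idx

z = np.random.randn(4, 8)          # batch of continuous latents
codebook = np.random.randn(16, 8)  # 16 learnable codes
z_q, codes = discrete_bottleneck(z, codebook)
```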
- Large-Scale Retrieval for Reinforcement Learning [15.372742113152233]
In reinforcement learning, the dominant paradigm is for an agent to amortise information that helps decision-making into its network weights.
Here, we pursue an alternative approach in which agents can utilise large-scale context-sensitive database lookups to support their parametric computations.
arXiv Detail & Related papers (2022-06-10T18:25:30Z)
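The lookup step of such a retrieval-augmented agent might look like the following: the agent queries a large store of encoded experience for its nearest neighbours and conditions its parametric computation on what comes back. This is a sketch of the general pattern, not the paper's system.

```python
import numpy as np

def retrieve(query, keys, values, k=5):
    """Return the k stored experiences whose keys are most similar to the
    query embedding, plus their similarity scores."""
    sims = keys @ query / (np.linalg.norm(keys, axis=1) * np.linalg.norm(query) + 1e-8)
    top = np.argsort(-sims)[:k]
    return values[top], sims[top]
```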
- Information-Theoretic Odometry Learning [83.36195426897768]
We propose a unified information-theoretic framework for learning-based odometry estimation.
The proposed framework provides an elegant tool for evaluating and understanding performance in information-theoretic language.
arXiv Detail & Related papers (2022-03-11T02:37:35Z)
- The Value of Information When Deciding What to Learn [21.945359614094503]
This work builds upon the seminal design principle of information-directed sampling (Russo & Van Roy, 2014).
We offer new insights into learning targets from the rate-distortion literature before turning to empirical results that confirm the value of information when deciding what to learn (a toy rate-distortion computation is sketched below).
arXiv Detail & Related papers (2021-10-26T19:23:12Z)
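The rate-distortion view of learning targets invoked above has a classical computational core: the Blahut-Arimoto iteration, which finds the channel minimizing rate at a given distortion trade-off. The sketch below is the textbook algorithm, not the paper's agent design.

```python
import numpy as np

def blahut_arimoto(p_x, dist, beta, n_iters=200):
    """Compute the optimal channel q(x_hat | x) for source p(x), distortion
    matrix d(x, x_hat), and rate-distortion trade-off parameter beta."""
    n, m = dist.shape
    q_y = np.full(m, 1.0 / m)  # marginal over reproduction symbols
    for _ in range(n_iters):
        q_y_x = q_y[None, :] * np.exp(-beta * dist)
        q_y_x /= q_y_x.sum(axis=1, keepdims=True)  # channel update
        q_y = p_x @ q_y_x                          # marginal update
    return q_y_x
```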
- Which Mutual-Information Representation Learning Objectives are Sufficient for Control? [80.2534918595143]
Mutual information provides an appealing formalism for learning representations of data.
This paper formalizes the sufficiency of a state representation for learning and representing the optimal policy.
Surprisingly, we find that two of these objectives can yield insufficient representations given mild and common assumptions on the structure of the MDP.
arXiv Detail & Related papers (2021-06-14T10:12:34Z)
- Curious Representation Learning for Embodied Intelligence [81.21764276106924]
Self-supervised representation learning has achieved remarkable success in recent years.
Yet to build truly intelligent agents, we must construct representation learning algorithms that can learn from environments.
We propose a framework, curious representation learning, which jointly learns a reinforcement learning policy and a visual representation model.
arXiv Detail & Related papers (2021-05-03T17:59:20Z)
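The joint training in curious representation learning can be caricatured in a few lines: the representation model's loss on a visited observation doubles as the policy's intrinsic reward, so the policy is paid to find data the model has not yet fit. The "model" below is just a running mean, purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy loop: the representation "model" is a running mean of observations,
# and its squared error on new data is the intrinsic reward.
running_mean, count = np.zeros(4), 0
for _ in range(5):
    obs = rng.normal(size=4)                      # observation from the environment
    model_loss = float(((obs - running_mean) ** 2).mean())
    intrinsic_reward = model_loss                 # reward handed to the RL policy
    count += 1
    running_mean += (obs - running_mean) / count  # "train" the representation model
```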
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes serve both as a summary of the agent's exploratory experience and as a basis for representing observations (a minimal version of the assignment step is sketched after this entry).
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
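A minimal version of the prototype step mentioned above: represent an observation embedding by its soft assignment over a set of prototype vectors, which double as a compact summary of the data seen so far. The temperature and normalization choices here are assumptions, not the paper's exact recipe.

```python
import numpy as np

def prototype_features(z, prototypes, temp=0.1):
    """Soft-assign an embedding to prototypes via a temperature-scaled
    softmax over cosine similarities."""
    z = z / np.linalg.norm(z)
    protos = prototypes / np.linalg.norm(prototypes, axis=1, keepdims=True)
    sims = protos @ z / temp
    sims -= sims.max()  # stabilize the softmax
    probs = np.exp(sims)
    return probs / probs.sum()

features = prototype_features(np.random.randn(8), np.random.randn(32, 8))
```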