Privileged Information Dropout in Reinforcement Learning
- URL: http://arxiv.org/abs/2005.09220v1
- Date: Tue, 19 May 2020 05:32:33 GMT
- Title: Privileged Information Dropout in Reinforcement Learning
- Authors: Pierre-Alexandre Kamienny, Kai Arulkumaran, Feryal Behbahani, Wendelin Boehmer, Shimon Whiteson
- Abstract summary: Using privileged information during training can improve the sample efficiency and performance of machine learning systems.
In this work, we investigate Privileged Information Dropout (PID) for augmenting the inputs of agents with privileged information, an approach that can be applied equally to value-based and policy-based reinforcement learning algorithms.
- Score: 56.82218103971113
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Using privileged information during training can improve the sample
efficiency and performance of machine learning systems. This paradigm has been
applied to reinforcement learning (RL), primarily in the form of distillation
or auxiliary tasks, and less commonly in the form of augmenting the inputs of
agents. In this work, we investigate Privileged Information Dropout (PID) for
achieving the latter, which can be applied equally to value-based and
policy-based RL algorithms. Within a simple partially-observed environment, we
demonstrate that PID outperforms alternatives for leveraging privileged
information, including distillation and auxiliary tasks, and can successfully
utilise different types of privileged information. Finally, we analyse its
effect on the learned representations.
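The abstract does not say how Privileged Information Dropout injects privileged information into the agent. One plausible reading, offered only as an illustrative sketch and not as the authors' implementation, is a dropout-style layer that multiplies each feature by Gaussian noise whose scale is computed from the privileged input during training, and that is switched off at test time, when the privileged input is unavailable. The function name `pi_dropout`, the linear-plus-softplus noise model, and the `weights` parameter are all assumptions made for this example.

```python
import math
import random


def pi_dropout(features, privileged, weights, training, rng=random):
    """Hypothetical privileged-information dropout over a feature vector.

    features:   list of floats (the agent's hidden features)
    privileged: list of floats (extra information seen only in training)
    weights:    one row per feature, mapping `privileged` to a noise scale
                (assumed linear map followed by softplus)
    """
    if not training:
        # No privileged input at test time: fall back to the noise mean (1.0),
        # which leaves the features unchanged.
        return list(features)

    noisy = []
    for h, w_row in zip(features, weights):
        # The noise scale depends only on the privileged input.
        pre_activation = sum(w * z for w, z in zip(w_row, privileged))
        sigma = math.log1p(math.exp(pre_activation))  # softplus, always > 0
        noisy.append(h * rng.gauss(1.0, sigma))
    return noisy


# Example usage with made-up numbers.
rng = random.Random(0)
features = [0.5, -1.2, 2.0]
privileged = [1.0, 0.0]
weights = [[0.1, 0.3], [-0.2, 0.4], [0.0, 0.0]]

train_out = pi_dropout(features, privileged, weights, training=True, rng=rng)
test_out = pi_dropout(features, privileged, weights, training=False)
```

In this sketch the privileged input never enters the features directly; it only shapes the noise, so the same network can run without it at test time.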
Related papers
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z)
- Improving Sample Efficiency of Reinforcement Learning with Background Knowledge from Large Language Models [33.504700578933424]
Low sample efficiency is an enduring challenge of reinforcement learning (RL).
We introduce a framework that harnesses large language models to extract background knowledge of an environment.
Our experiments show that these methods achieve significant sample efficiency improvements in a spectrum of downstream tasks.
arXiv Detail & Related papers (2024-07-04T14:33:47Z)
- Learning Future Representation with Synthetic Observations for Sample-efficient Reinforcement Learning [12.277005054008017]
In visual Reinforcement Learning (RL), upstream representation learning largely determines the effect of downstream policy learning.
We try to improve auxiliary representation learning for RL by enriching auxiliary training data.
We propose a training-free method to synthesize observations that may contain future information.
The remaining synthetic observations and real observations then serve as the auxiliary data to achieve a clustering-based temporal association task.
arXiv Detail & Related papers (2024-05-20T02:43:04Z)
- ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP).
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
arXiv Detail & Related papers (2023-06-16T21:51:04Z)
- Representation Learning in Deep RL via Discrete Information Bottleneck [39.375822469572434]
We study how information bottlenecks can be used to construct latent states efficiently in the presence of task-irrelevant information.
We propose architectures that utilize variational and discrete information bottlenecks, coined as RepDIB, to learn structured factorized representations.
arXiv Detail & Related papers (2022-12-28T14:38:12Z)
- CCLF: A Contrastive-Curiosity-Driven Learning Framework for Sample-Efficient Reinforcement Learning [56.20123080771364]
We develop a model-agnostic Contrastive-Curiosity-Driven Learning Framework (CCLF) for reinforcement learning.
CCLF fully exploits sample importance and improves learning efficiency in a self-supervised manner.
We evaluate this approach on the DeepMind Control Suite, Atari, and MiniGrid benchmarks.
arXiv Detail & Related papers (2022-05-02T14:42:05Z)
- Robust Representation Learning via Perceptual Similarity Metrics [18.842322467828502]
Contrastive Input Morphing (CIM) is a representation learning framework that learns input-space transformations of the data.
We show that CIM is complementary to other mutual information-based representation learning techniques.
arXiv Detail & Related papers (2021-06-11T21:45:44Z)
- Reinforcement Learning with Prototypical Representations [114.35801511501639]
Proto-RL is a self-supervised framework that ties representation learning with exploration through prototypical representations.
These prototypes simultaneously serve as a summarization of the exploratory experience of an agent as well as a basis for representing observations.
This enables state-of-the-art downstream policy learning on a set of difficult continuous control tasks.
arXiv Detail & Related papers (2021-02-22T18:56:34Z)
- Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z)
This list is automatically generated from the titles and abstracts of the papers in this site.