Large-Scale Retrieval for Reinforcement Learning
- URL: http://arxiv.org/abs/2206.05314v1
- Date: Fri, 10 Jun 2022 18:25:30 GMT
- Title: Large-Scale Retrieval for Reinforcement Learning
- Authors: Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent Sifre,
Théophane Weber, Timothy Lillicrap
- Abstract summary: In reinforcement learning, the dominant paradigm is for an agent to amortise information that helps decision-making into its network weights.
Here, we pursue an alternative approach in which agents can utilise large-scale context-sensitive database lookups to support their parametric computations.
- Score: 15.372742113152233
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Effective decision making involves flexibly relating past experiences and
relevant contextual information to a novel situation. In deep reinforcement
learning, the dominant paradigm is for an agent to amortise information that
helps decision-making into its network weights via gradient descent on training
losses. Here, we pursue an alternative approach in which agents can utilise
large-scale context-sensitive database lookups to support their parametric
computations. This allows agents to directly learn in an end-to-end manner to
utilise relevant information to inform their outputs. In addition, new
information can be attended to by the agent, without retraining, by simply
augmenting the retrieval dataset. We study this approach in Go, a challenging
game for which the vast combinatorial state space privileges generalisation
over direct matching to past experiences. We leverage fast, approximate nearest
neighbor techniques in order to retrieve relevant data from a set of tens of
millions of expert demonstration states. Attending to this information provides
a significant boost to prediction accuracy and game-play performance over
simply using these demonstrations as training trajectories, providing a
compelling demonstration of the value of large-scale retrieval in reinforcement
learning agents.
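The mechanism the abstract describes can be sketched in a few lines: retrieve the stored states nearest to the current one, then let the agent's computation attend over them. The sketch below is illustrative only, using brute-force L2 search in place of the paper's fast approximate nearest-neighbour index; all array sizes and function names are assumptions, not the authors' implementation.

```python
import numpy as np

def retrieve_neighbors(query, database, k=4):
    """Return the k stored states closest to the query (brute-force L2;
    the paper uses a fast approximate nearest-neighbour index instead)."""
    dists = np.linalg.norm(database - query, axis=1)
    return database[np.argsort(dists)[:k]]

def attend(query, neighbors):
    """Soft attention over retrieved neighbours: a weighted summary the
    agent's parametric computation can condition on."""
    scores = neighbors @ query
    weights = np.exp(scores - scores.max())
    weights /= weights.sum()
    return weights @ neighbors

rng = np.random.default_rng(0)
database = rng.normal(size=(1000, 16))   # stand-in for demonstration states
query = rng.normal(size=16)
neighbors = retrieve_neighbors(query, database, k=8)
context = attend(query, neighbors)
print(neighbors.shape, context.shape)    # (8, 16) (16,)
```

Because only the retrieval dataset changes, new demonstrations can be attended to without retraining, as the abstract notes.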
Related papers
- Granularity Matters in Long-Tail Learning [62.30734737735273]
We offer a novel perspective on long-tail learning, inspired by an observation: datasets with finer granularity tend to be less affected by data imbalance.
We introduce open-set auxiliary classes that are visually similar to existing ones, aiming to enhance representation learning for both head and tail classes.
To prevent the overwhelming presence of auxiliary classes from disrupting training, we introduce a neighbor-silencing loss.
arXiv Detail & Related papers (2024-10-21T13:06:21Z) - Adaptive Memory Replay for Continual Learning [29.333341368722653]
Updating foundation models as new data becomes available can lead to catastrophic forgetting.
We introduce a framework of adaptive memory replay for continual learning, where sampling of past data is phrased as a multi-armed bandit problem.
We demonstrate the effectiveness of our approach, which maintains high performance while reducing forgetting by up to 10% at no training efficiency cost.
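The replay-as-bandit formulation summarised above might be sketched as follows, with UCB1 standing in for whichever bandit algorithm the paper actually uses; the buckets, reward signal, and all names here are illustrative assumptions.

```python
import math
import random

class BanditReplay:
    """Each arm is a bucket of past data; buckets are chosen by UCB1 on
    an observed reward (e.g. how much replaying the bucket helped)."""

    def __init__(self, n_buckets):
        self.counts = [0] * n_buckets
        self.values = [0.0] * n_buckets
        self.total = 0

    def select(self):
        # Play every arm once before trusting the UCB1 score.
        for arm, count in enumerate(self.counts):
            if count == 0:
                return arm
        return max(
            range(len(self.counts)),
            key=lambda a: self.values[a]
            + math.sqrt(2 * math.log(self.total) / self.counts[a]),
        )

    def update(self, arm, reward):
        self.counts[arm] += 1
        self.total += 1
        # Running mean of rewards observed for this bucket.
        self.values[arm] += (reward - self.values[arm]) / self.counts[arm]

random.seed(0)
usefulness = [0.2, 0.8, 0.5]   # hypothetical per-bucket replay benefit
bandit = BanditReplay(3)
for _ in range(500):
    arm = bandit.select()
    bandit.update(arm, float(random.random() < usefulness[arm]))
print(bandit.counts)
```

Over time the sampler concentrates replay on the buckets whose observed benefit is highest.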
arXiv Detail & Related papers (2024-04-18T22:01:56Z) - ALP: Action-Aware Embodied Learning for Perception [60.64801970249279]
We introduce Action-Aware Embodied Learning for Perception (ALP)
ALP incorporates action information into representation learning through a combination of optimizing a reinforcement learning policy and an inverse dynamics prediction objective.
We show that ALP outperforms existing baselines in several downstream perception tasks.
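The combination of objectives described above can be illustrated with a toy joint loss: a stand-in policy loss plus a cross-entropy inverse-dynamics term computed from a shared encoder. This is a hedged sketch, not ALP's architecture; every shape and weight here is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative shapes: 32-dim observations, 8-dim embeddings, 4 actions.
W_enc = rng.normal(scale=0.1, size=(32, 8))   # shared encoder weights
W_inv = rng.normal(scale=0.1, size=(16, 4))   # inverse-dynamics head

def encode(obs):
    return np.tanh(obs @ W_enc)

def log_softmax(logits):
    m = logits.max()
    return logits - m - np.log(np.exp(logits - m).sum())

def inverse_dynamics_loss(obs_t, obs_t1, action):
    """Cross-entropy for predicting which action was taken between two
    consecutive observations, from their shared embeddings."""
    z = np.concatenate([encode(obs_t), encode(obs_t1)])
    return -log_softmax(z @ W_inv)[action]

obs_t = rng.normal(size=32)
obs_t1 = rng.normal(size=32)
policy_loss = 0.5                 # stand-in for the RL objective's value
total = policy_loss + 1.0 * inverse_dynamics_loss(obs_t, obs_t1, action=2)
print(float(total))
```

The key point is that both terms backpropagate through the same encoder, so action information shapes the learned representation.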
arXiv Detail & Related papers (2023-06-16T21:51:04Z) - Accelerating exploration and representation learning with offline pre-training [52.6912479800592]
We show that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.
We show that learning a state representation using noise-contrastive estimation and a model of auxiliary reward can significantly improve the sample efficiency on the challenging NetHack benchmark.
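Noise-contrastive state representation learning of the kind mentioned above is commonly realised as an InfoNCE loss that scores the true successor state against noise states; the sketch below assumes that standard form, with an illustrative linear-tanh encoder rather than the paper's model.

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(scale=0.1, size=(24, 8))   # illustrative state encoder

def encode(states):
    return np.tanh(states @ W)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE: classify the true successor state against noise states.
    Minimising this pulls representations of consecutive states together."""
    candidates = np.vstack([positive[None], negatives])   # positive first
    logits = candidates @ anchor / temperature
    m = logits.max()
    return -(logits[0] - m - np.log(np.exp(logits - m).sum()))

s_t = rng.normal(size=24)
s_t1 = s_t + 0.05 * rng.normal(size=24)   # true successor: a nearby state
noise = rng.normal(size=(16, 24))         # negative samples
loss = info_nce(encode(s_t), encode(s_t1), encode(noise))
print(float(loss))
```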
arXiv Detail & Related papers (2023-03-31T18:03:30Z) - Membership Inference Attacks via Adversarial Examples [5.721380617450644]
Membership inference attacks are a recent line of research that aims to recover the training data used by a learning algorithm.
We develop a means to measure the leakage of training data, leveraging a quantity that serves as a proxy for the total variation of a trained model.
arXiv Detail & Related papers (2022-07-27T15:10:57Z) - Active Learning of Ordinal Embeddings: A User Study on Football Data [4.856635699699126]
Humans innately measure distance between instances in an unlabeled dataset using an unknown similarity function.
This work uses deep metric learning to learn these user-defined similarity functions from few annotations for a large football trajectory dataset.
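Learning a user-defined similarity function from few annotations is typically done with a triplet-style metric-learning loss; the sketch below shows that standard form, with toy 2-D embeddings standing in for real trajectory features.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=1.0):
    """Triplet loss, a standard deep-metric-learning objective: the
    positive should end up closer to the anchor than the negative,
    by at least `margin`."""
    d_pos = np.linalg.norm(anchor - positive)
    d_neg = np.linalg.norm(anchor - negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings; in practice these would come from a trajectory encoder.
a = np.array([0.0, 0.0])
p = np.array([0.1, 0.0])   # annotated as similar to the anchor
n = np.array([2.0, 0.0])   # annotated as dissimilar
print(triplet_loss(a, p, n))   # 0.0: the margin is already satisfied
```

Each user annotation ("this pair is more alike than that pair") supplies one such triplet, which is why few labels can go a long way.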
arXiv Detail & Related papers (2022-07-26T07:55:23Z) - Adversarial Training Helps Transfer Learning via Better Representations [17.497590668804055]
Transfer learning aims to leverage models pre-trained on source data to adapt efficiently to a target setting.
Recent works empirically demonstrate that adversarial training in the source data can improve the ability of models to transfer to new domains.
We show that adversarial training in the source data generates provably better representations, so fine-tuning on top of this representation leads to a more accurate predictor of the target data.
arXiv Detail & Related papers (2021-06-18T15:41:07Z) - Parrot: Data-Driven Behavioral Priors for Reinforcement Learning [79.32403825036792]
We propose a method for pre-training behavioral priors that can capture complex input-output relationships observed in successful trials.
We show how this learned prior can be used for rapidly learning new tasks without impeding the RL agent's ability to try out novel behaviors.
arXiv Detail & Related papers (2020-11-19T18:47:40Z) - Self-Supervised Contrastive Learning for Efficient User Satisfaction Prediction in Conversational Agents [35.2098736872247]
We propose a self-supervised contrastive learning approach to learn user-agent interactions.
We show that the pre-trained models using the self-supervised objective are transferable to the user satisfaction prediction.
We also propose a novel few-shot transfer learning approach that ensures better transferability for very small sample sizes.
arXiv Detail & Related papers (2020-10-21T18:10:58Z) - Relation-Guided Representation Learning [53.60351496449232]
We propose a new representation learning method that explicitly models and leverages sample relations.
Our framework well preserves the relations between samples.
By embedding samples into a subspace, we show that our method can address the large-scale and out-of-sample problems.
arXiv Detail & Related papers (2020-07-11T10:57:45Z) - Privileged Information Dropout in Reinforcement Learning [56.82218103971113]
Using privileged information during training can improve the sample efficiency and performance of machine learning systems.
In this work, we investigate Privileged Information Dropout (pid) for achieving the latter; it can be applied equally to value-based and policy-based reinforcement learning algorithms.
arXiv Detail & Related papers (2020-05-19T05:32:33Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.