Pretraining Representations for Data-Efficient Reinforcement Learning
- URL: http://arxiv.org/abs/2106.04799v1
- Date: Wed, 9 Jun 2021 04:14:27 GMT
- Title: Pretraining Representations for Data-Efficient Reinforcement Learning
- Authors: Max Schwarzer, Nitarshan Rajkumar, Michael Noukhovitch, Ankesh Anand,
Laurent Charlin, Devon Hjelm, Philip Bachman, Aaron Courville
- Abstract summary: We use unlabeled data to pretrain an encoder which is then finetuned on a small amount of task-specific data.
When limited to 100k steps of interaction on Atari games, our approach significantly surpasses prior work.
Our approach shows particular promise when combined with larger models as well as more diverse, task-aligned observational data.
- Score: 12.43475487724972
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Data efficiency is a key challenge for deep reinforcement learning. We
address this problem by using unlabeled data to pretrain an encoder which is
then finetuned on a small amount of task-specific data. To encourage learning
representations which capture diverse aspects of the underlying MDP, we employ
a combination of latent dynamics modelling and unsupervised goal-conditioned
RL. When limited to 100k steps of interaction on Atari games (equivalent to two
hours of human experience), our approach significantly surpasses prior work
combining offline representation pretraining with task-specific finetuning, and
compares favourably with other pretraining methods that require orders of
magnitude more data. Our approach shows particular promise when combined with
larger models as well as more diverse, task-aligned observational data --
approaching human-level performance and data-efficiency on Atari in our best
setting. We provide code associated with this work at
https://github.com/mila-iqia/SGI.
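The abstract names the two pretraining objectives but not their form. Below is a minimal, hypothetical sketch of the latent dynamics modelling component only: an encoder maps observations to latents, a transition model predicts the next latent from the current latent and action, and the prediction is scored against the encoding of the observed next frame. The module shapes, the cosine loss, and the absence of a momentum target network are simplifying assumptions; the paper's actual architecture and the unsupervised goal-conditioned RL objective are in the linked repository.
```python
# Hypothetical sketch of latent dynamics modelling for representation pretraining.
# Shapes, losses, and module names are illustrative assumptions, not SGI's exact design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class Encoder(nn.Module):
    def __init__(self, obs_dim=512, latent_dim=256):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, 512), nn.ReLU(),
                                 nn.Linear(512, latent_dim))

    def forward(self, obs):
        return self.net(obs)

class LatentDynamics(nn.Module):
    def __init__(self, latent_dim=256, num_actions=18):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(latent_dim + num_actions, 512), nn.ReLU(),
                                 nn.Linear(512, latent_dim))

    def forward(self, z, action_onehot):
        return self.net(torch.cat([z, action_onehot], dim=-1))

def dynamics_loss(encoder, dynamics, obs, action_onehot, next_obs):
    """Negative cosine similarity between the predicted and observed next latent."""
    z = encoder(obs)
    z_pred = dynamics(z, action_onehot)
    with torch.no_grad():                      # target latent, no gradient flows here
        z_next = encoder(next_obs)
    return -F.cosine_similarity(z_pred, z_next, dim=-1).mean()

# Toy usage on random tensors standing in for an unlabeled replay buffer.
enc, dyn = Encoder(), LatentDynamics()
obs, next_obs = torch.randn(32, 512), torch.randn(32, 512)
actions = F.one_hot(torch.randint(0, 18, (32,)), num_classes=18).float()
loss = dynamics_loss(enc, dyn, obs, actions, next_obs)
loss.backward()
```
In the full method, an objective of this kind is combined with goal-conditioned and inverse-dynamics objectives during pretraining, and only the encoder is carried over to task-specific finetuning.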
Related papers
- Denoising Pre-Training and Customized Prompt Learning for Efficient Multi-Behavior Sequential Recommendation [69.60321475454843]
We propose DPCPL, the first pre-training and prompt-tuning paradigm tailored for Multi-Behavior Sequential Recommendation.
In the pre-training stage, we propose a novel Efficient Behavior Miner (EBM) to filter out the noise at multiple time scales.
Subsequently, we propose to tune the pre-trained model in a highly efficient manner with the proposed Customized Prompt Learning (CPL) module.
arXiv Detail & Related papers (2024-08-21T06:48:38Z)
- Deep Active Learning for Data Mining from Conflict Text Corpora [0.0]
This paper proposes one such approach that is inexpensive and high performance, leveraging active learning.
The approach shows performance similar to human (gold-standard) coding while reducing the amount of required human annotation by as much as 99%.
arXiv Detail & Related papers (2024-02-02T17:16:23Z)
- Bad Students Make Great Teachers: Active Learning Accelerates Large-Scale Visual Understanding [9.112203072394648]
Power-law scaling indicates that large-scale training with uniform sampling is prohibitively slow.
Active learning methods aim to increase data efficiency by prioritizing learning on the most relevant examples.
arXiv Detail & Related papers (2023-12-08T19:26:13Z)
- Efficient Grammatical Error Correction Via Multi-Task Training and Optimized Training Schedule [55.08778142798106]
We propose auxiliary tasks that exploit the alignment between the original and corrected sentences.
We formulate each task as a sequence-to-sequence problem and perform multi-task training.
We find that the order of datasets used for training and even individual instances within a dataset may have important effects on the final performance.
arXiv Detail & Related papers (2023-11-20T14:50:12Z)
- Knowledge Distillation as Efficient Pre-training: Faster Convergence, Higher Data-efficiency, and Better Transferability [53.27240222619834]
Knowledge Distillation as Efficient Pre-training aims to efficiently transfer the learned feature representation from pre-trained models to new student models for future downstream tasks.
Our method performs comparably with supervised pre-training counterparts across 3 downstream tasks and 9 downstream datasets, while requiring 10x less data and 5x less pre-training time.
arXiv Detail & Related papers (2022-03-10T06:23:41Z)
- BigSSL: Exploring the Frontier of Large-Scale Semi-Supervised Learning for Automatic Speech Recognition [126.5605160882849]
We find that the combination of pre-training, self-training and scaling up model size greatly increases data efficiency.
We report on the universal benefits gained from using big pre-trained and self-trained models for a large set of downstream tasks.
arXiv Detail & Related papers (2021-09-27T17:59:19Z)
- Omni-supervised Facial Expression Recognition via Distilled Data [120.11782405714234]
We propose omni-supervised learning to exploit reliable samples in a large amount of unlabeled data for network training.
We experimentally verify that the new dataset can significantly improve the ability of the learned FER model.
To keep the training cost of this enlarged dataset manageable, we further apply a dataset distillation strategy to compress it into several informative class-wise images.
arXiv Detail & Related papers (2020-05-18T09:36:51Z)
- Improving Semantic Segmentation via Self-Training [75.07114899941095]
We show that we can obtain state-of-the-art results using a semi-supervised approach, specifically a self-training paradigm.
We first train a teacher model on labeled data, and then generate pseudo labels on a large set of unlabeled data (a minimal sketch of this loop appears after this list).
Our robust training framework can digest human-annotated and pseudo labels jointly and achieve top performances on Cityscapes, CamVid and KITTI datasets.
arXiv Detail & Related papers (2020-04-30T17:09:17Z)
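The self-training entry above follows a generic teacher/pseudo-label pattern: fit a teacher on the annotated data, label the unlabeled pool with it, then train a student on the union. The sketch below is a hypothetical illustration of that loop; the models, data loaders, and single-pass training schedule are placeholder assumptions, not the paper's implementation.
```python
# Hypothetical teacher/student self-training loop with pseudo labels.
# `labeled_loader` yields (x, y) batches; `unlabeled_loader` yields x batches (assumed).
import torch
import torch.nn as nn

def self_training(teacher, student, labeled_loader, unlabeled_loader, epochs=1):
    ce = nn.CrossEntropyLoss()
    opt_t = torch.optim.Adam(teacher.parameters())
    opt_s = torch.optim.Adam(student.parameters())

    # 1) Fit the teacher on the human-annotated data.
    for _ in range(epochs):
        for x, y in labeled_loader:
            opt_t.zero_grad()
            ce(teacher(x), y).backward()
            opt_t.step()

    # 2) Generate pseudo labels for the unlabeled pool.
    teacher.eval()
    pseudo = []
    with torch.no_grad():
        for x in unlabeled_loader:
            pseudo.append((x, teacher(x).argmax(dim=-1)))

    # 3) Train the student jointly on human-annotated and pseudo-labeled batches.
    for _ in range(epochs):
        for x, y in list(labeled_loader) + pseudo:
            opt_s.zero_grad()
            ce(student(x), y).backward()
            opt_s.step()
    return student
```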
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.