VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2202.10324v3
- Date: Fri, 31 Mar 2023 06:41:29 GMT
- Title: VRL3: A Data-Driven Framework for Visual Deep Reinforcement Learning
- Authors: Che Wang, Xufang Luo, Keith Ross, Dongsheng Li
- Abstract summary: We propose VRL3, a data-driven framework for solving visual deep reinforcement learning (DRL) tasks.
Our framework has three stages: in stage 1, we leverage non-RL datasets to learn task-agnostic visual representations; in stage 2, we use offline RL data; in stage 3, we fine-tune the agent with online RL.
On a set of challenging hand manipulation tasks, VRL3 achieves an average of 780% better sample efficiency than the previous SOTA.
- Score: 14.869611817084015
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose VRL3, a powerful data-driven framework with a simple design for
solving challenging visual deep reinforcement learning (DRL) tasks. We analyze
a number of major obstacles in taking a data-driven approach, and present a
suite of design principles, novel findings, and critical insights about
data-driven visual DRL. Our framework has three stages: in stage 1, we leverage
non-RL datasets (e.g. ImageNet) to learn task-agnostic visual representations;
in stage 2, we use offline RL data (e.g. a limited number of expert
demonstrations) to convert the task-agnostic representations into more powerful
task-specific representations; in stage 3, we fine-tune the agent with online
RL. On a set of challenging hand manipulation tasks with sparse reward and
realistic visual inputs, compared to the previous SOTA, VRL3 achieves an
average of 780% better sample efficiency. On the hardest task, VRL3 is
1220% more sample efficient (2440% when using a wider encoder) and solves the
task with only 10% of the computation. These significant results clearly
demonstrate the great potential of data-driven deep reinforcement learning.
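The three stages described in the abstract can be sketched as a simple sequential pipeline. This is a minimal illustrative skeleton, not the authors' implementation: every name in it (`PipelineState`, `stage1_pretrain`, `stage2_offline`, `stage3_online`, `run_pipeline`) is hypothetical, the training steps are placeholders that only record what happened, and keeping the demonstrations in the buffer during stage 3 is an assumption rather than something the abstract states.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Placeholder transition type: (description, reward). Hypothetical, for illustration only.
Transition = Tuple[str, float]


@dataclass
class PipelineState:
    encoder: str = "random-init"                             # what the visual encoder has learned so far
    buffer: List[Transition] = field(default_factory=list)   # data available to the RL agent
    history: List[str] = field(default_factory=list)         # order in which stages ran


def stage1_pretrain(state: PipelineState, dataset_name: str) -> None:
    # Stage 1: learn task-agnostic visual representations from non-RL data (e.g. ImageNet).
    state.encoder = f"pretrained-on-{dataset_name}"
    state.history.append("stage1")


def stage2_offline(state: PipelineState, demos: List[Transition]) -> None:
    # Stage 2: use offline RL data (e.g. a few expert demonstrations) to make
    # the representation task-specific. No environment interaction happens here.
    state.buffer.extend(demos)
    state.encoder = "task-specific"
    state.history.append("stage2")


def stage3_online(state: PipelineState,
                  collect: Callable[[int], Transition],
                  steps: int) -> None:
    # Stage 3: fine-tune with online RL. Assumption: demonstrations stay in the
    # buffer alongside newly collected transitions.
    for t in range(steps):
        state.buffer.append(collect(t))
    state.history.append("stage3")


def run_pipeline(dataset_name: str,
                 demos: List[Transition],
                 collect: Callable[[int], Transition],
                 online_steps: int) -> PipelineState:
    state = PipelineState()
    stage1_pretrain(state, dataset_name)
    stage2_offline(state, demos)
    stage3_online(state, collect, online_steps)
    return state
```

Calling `run_pipeline("ImageNet", demos, collect, online_steps)` with a list of demonstration transitions and a hypothetical `collect` function runs the stages in order; in a real system each placeholder body would be replaced by actual encoder pretraining, offline RL updates, and online RL updates.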
Related papers
- M2CURL: Sample-Efficient Multimodal Reinforcement Learning via Self-Supervised Representation Learning for Robotic Manipulation [0.7564784873669823]
We propose Multimodal Contrastive Unsupervised Reinforcement Learning (M2CURL).
Our approach employs a novel multimodal self-supervised learning technique that learns efficient representations and contributes to faster convergence of RL algorithms.
We evaluate M2CURL on the Tactile Gym 2 simulator and we show that it significantly enhances the learning efficiency in different manipulation tasks.
arXiv Detail & Related papers (2024-01-30T14:09:35Z)
- Scaling Data Generation in Vision-and-Language Navigation [116.95534559103788]
We propose an effective paradigm for generating large-scale data for learning.
We use 1200+ photo-realistic environments from the HM3D and Gibson datasets and synthesize 4.9 million instruction-trajectory pairs.
Thanks to our large-scale dataset, the performance of an existing agent can be pushed up (+11% absolute with regard to previous SoTA) to a significantly new best of 80% single-run success rate on the R2R test split by simple imitation learning.
arXiv Detail & Related papers (2023-07-28T16:03:28Z)
- MoDem: Accelerating Visual Model-Based Reinforcement Learning with Demonstrations [36.44386146801296]
Poor sample efficiency continues to be the primary challenge for deployment of deep Reinforcement Learning (RL) algorithms for real-world applications.
We find that leveraging just a handful of demonstrations can dramatically improve the sample-efficiency of model-based RL.
We empirically study three complex visuo-motor control domains and find that our method is 150%-250% more successful in completing sparse reward tasks.
arXiv Detail & Related papers (2022-12-12T04:28:50Z)
- Light-weight probing of unsupervised representations for Reinforcement Learning [20.638410483549706]
We study whether linear probing can be a proxy evaluation task for the quality of unsupervised RL representation.
We show that the probing tasks are strongly rank correlated with the downstream RL performance on the Atari100k Benchmark.
This provides a more efficient method for exploring the space of pretraining algorithms and identifying promising pretraining recipes.
arXiv Detail & Related papers (2022-08-25T21:08:01Z)
- Offline Visual Representation Learning for Embodied Navigation [50.442660137987275]
Offline pretraining of visual representations with self-supervised learning (SSL).
Online finetuning of visuomotor representations on specific tasks with image augmentations under long learning schedules.
arXiv Detail & Related papers (2022-04-27T23:22:43Z)
- X-Learner: Learning Cross Sources and Tasks for Universal Visual Representation [71.51719469058666]
We propose a representation learning framework called X-Learner.
X-Learner learns the universal feature of multiple vision tasks supervised by various sources.
X-Learner achieves strong performance on different tasks without extra annotations, modalities and computational costs.
arXiv Detail & Related papers (2022-03-16T17:23:26Z)
- 2nd Place Scheme on Action Recognition Track of ECCV 2020 VIPriors Challenges: An Efficient Optical Flow Stream Guided Framework [57.847010327319964]
We propose a data-efficient framework that can train the model from scratch on small datasets.
Specifically, by introducing a 3D central difference convolution operation, we propose a novel C3D neural network-based two-stream framework.
We show that our method can achieve promising results even without a model pre-trained on large-scale datasets.
arXiv Detail & Related papers (2020-08-10T09:50:28Z)
- PointContrast: Unsupervised Pre-training for 3D Point Cloud Understanding [107.02479689909164]
In this work, we aim at facilitating research on 3D representation learning.
We measure the effect of unsupervised pre-training on a large source set of 3D scenes.
arXiv Detail & Related papers (2020-07-21T17:59:22Z)
- Generalized Hindsight for Reinforcement Learning [154.0545226284078]
We argue that low-reward data collected while trying to solve one task provides little to no signal for solving that particular task.
We present Generalized Hindsight: an approximate inverse reinforcement learning technique for relabeling behaviors with the right tasks.
arXiv Detail & Related papers (2020-02-26T18:57:05Z)