Action-Conditioned Contrastive Policy Pretraining
- URL: http://arxiv.org/abs/2204.02393v1
- Date: Tue, 5 Apr 2022 17:58:22 GMT
- Title: Action-Conditioned Contrastive Policy Pretraining
- Authors: Qihang Zhang, Zhenghao Peng, Bolei Zhou
- Abstract summary: Deep visuomotor policy learning achieves promising results in control tasks such as robotic manipulation and autonomous driving.
It requires a huge number of online interactions with the training environment, which limits its real-world application.
In this work, we aim to pretrain policy representations for driving tasks using hours-long uncurated YouTube videos.
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Deep visuomotor policy learning achieves promising results in control tasks
such as robotic manipulation and autonomous driving, where the action is
generated from the visual input by the neural policy. However, it requires a
huge number of online interactions with the training environment, which limits
its real-world application. Compared to the popular unsupervised feature
learning for visual recognition, feature pretraining for visuomotor control
tasks is much less explored. In this work, we aim to pretrain policy
representations for driving tasks using hours-long uncurated YouTube videos. A
new contrastive policy pretraining method is developed to learn
action-conditioned features from video frames with action pseudo labels.
Experiments show that the resulting action-conditioned features bring
substantial improvements to the downstream reinforcement learning and imitation
learning tasks, outperforming the weights pretrained from previous unsupervised
learning methods. Code and models will be made publicly available.
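The core recipe described in the abstract, contrastive learning over video frames where frames with similar action pseudo labels form positive pairs, can be illustrated with a minimal NumPy sketch. The InfoNCE formulation below is the standard one; the pseudo-action binning scheme and both function names are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """InfoNCE loss: each anchor's positive is the matching row in
    `positives`; all other rows in the batch act as negatives."""
    # L2-normalize the embeddings so similarity is cosine similarity
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # (B, B) similarity matrix
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))           # positives on the diagonal

def action_conditioned_pairs(frame_feats, pseudo_actions, n_bins=8):
    """Hypothetical pairing scheme: discretize scalar pseudo-action labels
    into bins and treat two frames sharing a bin as a positive pair."""
    bins = np.digitize(pseudo_actions, np.linspace(-1, 1, n_bins - 1))
    anchors, positives = [], []
    for b in np.unique(bins):
        idx = np.where(bins == b)[0]
        if len(idx) >= 2:
            anchors.append(frame_feats[idx[0]])
            positives.append(frame_feats[idx[1]])
    return np.array(anchors), np.array(positives)
```

In the paper the pseudo labels come from an inverse dynamics model applied to the uncurated videos; here they are simply assumed to be given as scalars in [-1, 1].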
Related papers
- Pre-trained Visual Dynamics Representations for Efficient Policy Learning [33.62440075940917]
We propose Pre-trained Visual Dynamics Representations (PVDR) to bridge the domain gap between videos and downstream tasks for efficient policy learning.
The pre-trained visual dynamics representations capture the visual dynamics prior knowledge in the videos.
This abstract prior knowledge can be readily adapted to downstream tasks and aligned with executable actions through online adaptation.
arXiv Detail & Related papers (2024-11-05T15:18:02Z)
- Any-point Trajectory Modeling for Policy Learning [64.23861308947852]
We introduce Any-point Trajectory Modeling (ATM) to predict future trajectories of arbitrary points within a video frame.
ATM outperforms strong video pre-training baselines by 80% on average.
We show effective transfer learning of manipulation skills from human videos and videos from a different robot morphology.
arXiv Detail & Related papers (2023-12-28T23:34:43Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to use offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Reinforcement Learning with Action-Free Pre-Training from Videos [95.25074614579646]
We introduce a framework that learns representations useful for understanding the dynamics via generative pre-training on videos.
Our framework significantly improves both final performances and sample-efficiency of vision-based reinforcement learning.
arXiv Detail & Related papers (2022-03-25T19:44:09Z)
- Distilling Motion Planner Augmented Policies into Visual Control Policies for Robot Manipulation [26.47544415550067]
We propose to distill a state-based motion planner augmented policy to a visual control policy.
We evaluate our method on three manipulation tasks in obstructed environments.
Our framework is highly sample-efficient and outperforms state-of-the-art algorithms.
arXiv Detail & Related papers (2021-11-11T18:52:00Z)
- Augmenting Reinforcement Learning with Behavior Primitives for Diverse Manipulation Tasks [17.13584584844048]
This work introduces MAnipulation Primitive-augmented reinforcement LEarning (MAPLE), a learning framework that augments standard reinforcement learning algorithms with a pre-defined library of behavior primitives.
We develop a hierarchical policy that selects among the primitives and instantiates their execution with input parameters.
We demonstrate that MAPLE outperforms baseline approaches by a significant margin on a suite of simulated manipulation tasks.
arXiv Detail & Related papers (2021-10-07T17:44:33Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Neural Dynamic Policies for End-to-End Sensorimotor Learning [51.24542903398335]
The current dominant paradigm in sensorimotor control, whether imitation or reinforcement learning, is to train policies directly in raw action spaces.
We propose Neural Dynamic Policies (NDPs) that make predictions in trajectory distribution space.
NDPs outperform the prior state-of-the-art in terms of either efficiency or performance across several robotic control tasks.
arXiv Detail & Related papers (2020-12-04T18:59:32Z)
- NEARL: Non-Explicit Action Reinforcement Learning for Robotic Control [15.720231070808696]
In this paper, we propose a novel hierarchical reinforcement learning framework without explicit action.
Our meta policy predicts the next optimal state, and the actual action is produced by an inverse dynamics model.
Under our framework, widely available state-only demonstrations can be exploited effectively for imitation learning.
arXiv Detail & Related papers (2020-11-02T15:28:19Z)
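The division of labor in NEARL's summary, a meta policy that proposes the next desired state and an inverse dynamics model that recovers the action realizing it, can be sketched on a toy problem. The linear dynamics, the 0.5 step fraction, and all function names below are illustrative assumptions, not NEARL's implementation; in the real framework both components are learned networks.

```python
import numpy as np

def step(state, action):
    """Toy environment with linear dynamics: s' = s + a."""
    return state + action

def meta_policy(state, goal):
    """Hypothetical meta policy: propose the next intermediate state by
    moving halfway toward the goal, without outputting any action."""
    return state + 0.5 * (goal - state)

def inverse_dynamics(state, next_state):
    """Inverse dynamics model for the toy environment: recover the action
    that transitions `state` to `next_state`. Exact here because s' = s + a;
    in practice this would be a learned network."""
    return next_state - state

def rollout(state, goal, horizon=10):
    """Hierarchical control loop: the meta policy picks target states and
    the inverse dynamics model converts them into executable actions."""
    for _ in range(horizon):
        target = meta_policy(state, goal)
        action = inverse_dynamics(state, target)
        state = step(state, action)
    return state
```

Because the meta policy operates purely in state space, state-only demonstrations (goal states without recorded actions) can supervise it directly, which is the property the NEARL summary highlights for imitation learning.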
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.