PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive
Information Representations
- URL: http://arxiv.org/abs/2207.13224v1
- Date: Wed, 27 Jul 2022 00:26:15 GMT
- Title: PI-ARS: Accelerating Evolution-Learned Visual-Locomotion with Predictive
Information Representations
- Authors: Kuang-Huei Lee, Ofir Nachum, Tingnan Zhang, Sergio Guadarrama, Jie
Tan, Wenhao Yu
- Abstract summary: Evolution Strategy (ES) algorithms have shown promising results in training complex robotic control policies.
PI-ARS combines a gradient-based representation learning technique, Predictive Information (PI), with a gradient-free ES algorithm, Augmented Random Search (ARS).
We show PI-ARS demonstrates significantly better learning efficiency and performance compared to the ARS baseline.
- Score: 32.37414300338581
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Evolution Strategy (ES) algorithms have shown promising results in training
complex robotic control policies due to their massive parallelism capability,
simple implementation, effective parameter-space exploration, and fast training
time. However, a key limitation of ES is its scalability to large capacity
models, including modern neural network architectures. In this work, we develop
Predictive Information Augmented Random Search (PI-ARS) to mitigate this
limitation by leveraging recent advancements in representation learning to
reduce the parameter search space for ES. Namely, PI-ARS combines a
gradient-based representation learning technique, Predictive Information (PI),
with a gradient-free ES algorithm, Augmented Random Search (ARS), to train
policies that can process complex robot sensory inputs and handle highly
nonlinear robot dynamics. We evaluate PI-ARS on a set of challenging
visual-locomotion tasks where a quadruped robot needs to walk on uneven
stepping stones, quincuncial piles, and moving platforms, as well as to
complete an indoor navigation task. Across all tasks, PI-ARS demonstrates
significantly better learning efficiency and performance compared to the ARS
baseline. We further validate our algorithm by demonstrating that the learned
policies can successfully transfer to a real quadruped robot, for example,
achieving a 100% success rate on the real-world stepping stone environment,
dramatically improving over prior results that achieved 40% success.
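The paper itself does not include source code; below is a minimal, non-authoritative Python sketch of the core idea. The ARS update searches only the weights of a small linear policy head that acts on features from a frozen, separately pretrained encoder (here reduced to an identity stand-in); the `encode`, `rollout_return`, and `env_step` functions and all hyperparameter values are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

def encode(obs):
    # Stand-in for the frozen encoder pretrained with the gradient-based
    # predictive-information objective; here it is just the identity.
    return obs

def rollout_return(theta, env_step, horizon=100):
    """Total reward of one episode using a linear policy head on top of
    frozen features. `env_step` is a toy stand-in for the environment."""
    obs, total = np.zeros(theta.shape[1]), 0.0
    for _ in range(horizon):
        action = theta @ encode(obs)
        obs, reward = env_step(obs, action)
        total += reward
    return total

def ars_step(theta, env_step, n_dirs=8, top_b=4, step_size=0.02, noise=0.03):
    """One Augmented Random Search update on the policy-head weights,
    the only parameters the evolution strategy searches over."""
    triples = []
    for _ in range(n_dirs):
        d = rng.standard_normal(theta.shape)
        r_plus = rollout_return(theta + noise * d, env_step)
        r_minus = rollout_return(theta - noise * d, env_step)
        triples.append((r_plus, r_minus, d))
    # Keep the top-b directions by best-case return, then take a
    # reward-weighted step scaled by the reward standard deviation.
    triples.sort(key=lambda t: max(t[0], t[1]), reverse=True)
    top = triples[:top_b]
    sigma_r = np.std([r for t in top for r in t[:2]]) + 1e-8
    update = sum((rp - rm) * d for rp, rm, d in top)
    return theta + step_size / (top_b * sigma_r) * update

# Toy usage: a 4-d system rewarded for staying near the origin.
def env_step(obs, action):
    nxt = obs + 0.1 * action + 0.01 * rng.standard_normal(obs.shape)
    return nxt, -float(nxt @ nxt)

theta = np.zeros((4, 4))  # act_dim x feat_dim linear policy head
for _ in range(20):
    theta = ars_step(theta, env_step)
```

Because the gradient-free search touches only the low-dimensional head, while the high-capacity visual encoder is trained separately with the gradient-based predictive-information objective, the parameter space that ES must explore stays small.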
Related papers
- Precise and Dexterous Robotic Manipulation via Human-in-the-Loop Reinforcement Learning [47.785786984974855]
We present a human-in-the-loop vision-based RL system that demonstrates impressive performance on a diverse set of dexterous manipulation tasks.
Our approach integrates demonstrations and human corrections, efficient RL algorithms, and other system-level design choices to learn policies.
We show that our method significantly outperforms imitation learning baselines and prior RL approaches, with an average 2x improvement in success rate and 1.8x faster execution.
arXiv Detail & Related papers (2024-10-29T08:12:20Z)
- SERL: A Software Suite for Sample-Efficient Robotic Reinforcement Learning [85.21378553454672]
We develop a library containing a sample-efficient off-policy deep RL method, together with methods for computing rewards and resetting the environment.
We find that our implementation can achieve very efficient learning, acquiring policies for PCB assembly, cable routing, and object relocation.
These policies achieve perfect or near-perfect success rates, remain extremely robust under perturbations, and exhibit emergent recovery and correction behaviors.
arXiv Detail & Related papers (2024-01-29T10:01:10Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to enable efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
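As a rough sketch of this idea (not the paper's code; the function name, batch layout, and use of plain NumPy are assumptions), the critic can score a state-action embedding against a goal embedding, trained so that each pair rates its own trajectory's future state above the other states in the batch:

```python
import numpy as np

def contrastive_critic_loss(phi_sa, psi_future):
    """InfoNCE-style critic loss (sketch). Row i of phi_sa embeds a
    state-action pair; row i of psi_future embeds a state actually
    reached later in the same trajectory (its positive). Every other
    row in the batch serves as a negative."""
    logits = phi_sa @ psi_future.T                       # (B, B) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # softmax stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -float(np.mean(np.diag(log_probs)))           # positives on diagonal

# Toy usage with random embeddings.
rng = np.random.default_rng(0)
loss = contrastive_critic_loss(rng.standard_normal((32, 16)),
                               rng.standard_normal((32, 16)))
```

Because the positives come from the agent's own trajectories rather than from reward labels, the critic is trained entirely self-supervised; this is the sense in which RL becomes a self-supervised problem.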
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
Sirius is a principled framework for humans and robots to collaborate through a division of work.
Partially autonomous robots are tasked with the major portion of decision-making, handling the situations where they work reliably.
We introduce a new learning algorithm to improve the policy's performance on the data collected from the task executions.
arXiv Detail & Related papers (2022-11-15T18:53:39Z)
- Accelerating Robotic Reinforcement Learning via Parameterized Action Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks still remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that this simple change to the action interface substantially improves both learning efficiency and task performance.
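As a minimal sketch of what such a parameterized-primitive action interface could look like (the primitive names, argument layouts, and `env_step` signature here are hypothetical, not the RAPS library's actual API):

```python
import numpy as np

# Hypothetical primitive library (names and argument layouts invented
# for illustration): each primitive expands the policy's continuous
# arguments into a short sequence of low-level 4-d control commands.
PRIMITIVES = {
    "move_delta": lambda a: [np.append(a[:3] * 0.05, 0.0)] * 10,
    "grasp":      lambda a: [np.array([0.0, 0.0, 0.0, np.sign(a[0])])] * 5,
}

def execute_primitive(env_step, name, args):
    """Unroll one parameterized primitive into low-level control steps.
    The RL agent sees only the final observation and the summed reward,
    so the whole primitive counts as a single macro-action."""
    total = 0.0
    for low_level_action in PRIMITIVES[name](np.asarray(args, dtype=float)):
        obs, reward = env_step(low_level_action)
        total += reward
    return obs, total

# Toy usage: an environment stub that returns zero reward.
obs, r = execute_primitive(lambda a: (a, 0.0), "move_delta", [1.0, 0.0, 0.0])
```

The RL policy then chooses a primitive and its continuous arguments at each decision step, so a single learned action spans many low-level control steps.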
arXiv Detail & Related papers (2021-10-28T17:59:30Z)
- Bayesian Meta-Learning for Few-Shot Policy Adaptation Across Robotic Platforms [60.59764170868101]
Reinforcement learning methods can achieve significant performance but require a large amount of training data collected on the same robotic platform.
We formulate adaptation to a new platform as a few-shot meta-learning problem where the goal is to find a model that captures the common structure shared across different robotic platforms.
We experimentally evaluate our framework on a simulated reaching and a real-robot picking task using 400 simulated robots.
arXiv Detail & Related papers (2021-03-05T14:16:20Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)