Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with
Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2103.13842v1
- Date: Thu, 25 Mar 2021 13:50:24 GMT
- Title: Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with
Deep Reinforcement Learning
- Authors: Andrew S. Morgan, Daljeet Nandha, Georgia Chalvatzaki, Carlo D'Eramo,
Aaron M. Dollar, and Jan Peters
- Abstract summary: Model Predictive Actor-Critic (MoPAC) is a hybrid model-based/model-free method that combines model predictive rollouts with policy optimization to mitigate model bias.
MoPAC guarantees optimal skill learning up to an approximation error and reduces necessary physical interaction with the environment.
- Score: 42.525696463089794
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Substantial advancements to model-based reinforcement learning algorithms
have been impeded by the model bias induced by the collected data, which
generally hurts performance. Meanwhile, their inherent sample efficiency
warrants utility for most robot applications, limiting potential damage to the
robot and its environment during training. Inspired by information theoretic
model predictive control and advances in deep reinforcement learning, we
introduce Model Predictive Actor-Critic (MoPAC), a hybrid
model-based/model-free method that combines model predictive rollouts with
policy optimization to mitigate model bias. MoPAC leverages optimal
trajectories to guide policy learning, but explores via its model-free method,
allowing the algorithm to learn more expressive dynamics models. This
combination guarantees optimal skill learning up to an approximation error and
reduces necessary physical interaction with the environment, making it suitable
for real-robot training. We provide extensive results showcasing how our
proposed method generally outperforms current state-of-the-art and conclude by
evaluating MoPAC for learning on a physical robotic hand performing valve
rotation and finger gaiting--a task that requires grasping, manipulation, and
then regrasping of an object.
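The hybrid loop the abstract describes (model predictive rollouts guiding a policy, with model-free exploration supplying the data that refines the dynamics model) can be sketched in a few lines. This is a minimal toy illustration with an assumed 1-D environment, a least-squares dynamics model, and a random-shooting planner; it is not the paper's actual MoPAC algorithm:

```python
import random

random.seed(0)

# Toy 1-D environment (assumed for illustration): s' = s + a, reward -s'^2.
def env_step(s, a):
    s_next = s + a
    return s_next, -s_next ** 2

# Learned dynamics model s' ~ s + theta * a, fit by least squares.
def fit_model(transitions):
    num = sum(a * (s2 - s) for s, a, s2 in transitions)
    den = sum(a * a for _, a, _ in transitions) + 1e-8
    return num / den

# Model predictive rollout: random-shooting planner under the learned model.
def mpc_action(s, theta, horizon=5, candidates=32):
    best_a, best_ret = 0.0, -float("inf")
    for _ in range(candidates):
        seq = [random.gauss(0, 1) for _ in range(horizon)]
        sm, ret = s, 0.0
        for a in seq:
            sm = sm + theta * a      # predicted next state
            ret += -sm ** 2          # predicted reward
        if ret > best_ret:
            best_ret, best_a = ret, seq[0]
    return best_a

theta, w = 0.0, 0.0                  # model parameter; linear policy a = w * s
transitions, s = [], 2.0
for _ in range(200):
    a_guide = mpc_action(s, theta)            # optimal trajectory guides learning
    a = a_guide + random.gauss(0, 0.1)        # model-free exploration noise
    s_next, _ = env_step(s, a)
    transitions.append((s, a, s_next))
    theta = fit_model(transitions)            # more data -> better dynamics model
    w += 0.01 * (a_guide - w * s) * s         # regress policy onto MPC action
    s = s_next if abs(s_next) < 10 else 2.0

print(f"learned dynamics coefficient theta = {theta:.2f} (true value: 1.0)")
```

Because the true toy dynamics are exactly s' = s + a, the least-squares fit recovers theta close to 1.0; the point of the sketch is the division of labor, with the planner exploiting the model and the noisy executed actions feeding it new data.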
Related papers
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
We introduce a novel framework for learning world models.
By providing a scalable and robust framework, we pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- Sample Efficient Robot Learning in Supervised Effect Prediction Tasks [0.0]
In this work, we develop a novel AL framework geared towards robotics regression tasks, such as action-effect prediction and, more generally, for world model learning, which we call MUSEL.
MUSEL aims to extract model uncertainty from the total uncertainty estimate given by a suitable learning engine by making use of learning progress and input diversity, and uses it to improve sample efficiency beyond state-of-the-art action-effect prediction methods.
The efficacy of MUSEL is demonstrated by comparing its performance to standard methods used in robot action-effect learning.
arXiv Detail & Related papers (2024-12-03T09:48:28Z)
- Learning Low-Dimensional Strain Models of Soft Robots by Looking at the Evolution of Their Shape with Application to Model-Based Control [2.058941610795796]
This paper introduces a streamlined method for learning low-dimensional, physics-based models.
We validate our approach through simulations with various planar soft manipulators.
Because the method generates physically compatible models, the learned models can be straightforwardly combined with model-based control policies.
arXiv Detail & Related papers (2024-10-31T18:37:22Z)
- Active Exploration in Bayesian Model-based Reinforcement Learning for Robot Manipulation [8.940998315746684]
We propose a model-based reinforcement learning (RL) approach for robotic arm end-tasks.
We employ Bayesian neural network models to represent, in a probabilistic way, both the belief and information encoded in the dynamic model during exploration.
Our experiments show the advantages of our Bayesian model-based RL approach, with results of similar quality to relevant alternatives.
arXiv Detail & Related papers (2024-04-02T11:44:37Z)
- STORM: Efficient Stochastic Transformer based World Models for Reinforcement Learning [82.03481509373037]
Recently, model-based reinforcement learning algorithms have demonstrated remarkable efficacy in visual input environments.
We introduce the Stochastic Transformer-based wORld Model (STORM), an efficient world model architecture that combines strong modeling and generation capabilities.
STORM achieves a mean human performance of $126.7\%$ on the Atari $100$k benchmark, setting a new record among state-of-the-art methods.
arXiv Detail & Related papers (2023-10-14T16:42:02Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics model and/or simulator and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
- Sample-Efficient Reinforcement Learning via Conservative Model-Based Actor-Critic [67.00475077281212]
Model-based reinforcement learning algorithms are more sample efficient than their model-free counterparts.
We propose Conservative Model-Based Actor-Critic (CMBAC), a novel approach that achieves high sample efficiency without a strong reliance on accurate learned models.
We show that CMBAC significantly outperforms state-of-the-art approaches in terms of sample efficiency on several challenging tasks.
arXiv Detail & Related papers (2021-12-16T15:33:11Z)
- Sample Efficient Reinforcement Learning via Model-Ensemble Exploration and Exploitation [3.728946517493471]
MEEE is a model-ensemble method that consists of optimistic exploration and weighted exploitation.
Our approach outperforms other model-free and model-based state-of-the-art methods, especially in sample complexity.
arXiv Detail & Related papers (2021-07-05T07:18:20Z)
- Information Theoretic Model Predictive Q-Learning [64.74041985237105]
We present a novel theoretical connection between information theoretic MPC and entropy regularized RL.
We develop a Q-learning algorithm that can leverage biased models.
arXiv Detail & Related papers (2019-12-31T00:29:22Z)
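The last entry above belongs to the information-theoretic MPC family that inspired MoPAC. A common instance is MPPI-style planning, where sampled action sequences are averaged with exponential weights derived from their costs. The sketch below uses assumed known 1-D dynamics and a quadratic state cost; it illustrates the weighting scheme only and is not the cited paper's algorithm:

```python
import math
import random

random.seed(1)

# MPPI-style update (a standard information-theoretic MPC scheme, sketched
# here for illustration; details differ from the cited paper).
def mppi_update(s0, horizon=10, samples=64, lam=1.0):
    """Sample noisy action sequences, weight them by exp(-cost / lambda),
    and return the softmax-averaged first action."""
    def rollout_cost(seq):
        s, cost = s0, 0.0
        for a in seq:
            s = s + a            # assumed known dynamics: s' = s + a
            cost += s * s        # quadratic state cost
        return cost

    seqs = [[random.gauss(0, 1) for _ in range(horizon)] for _ in range(samples)]
    costs = [rollout_cost(seq) for seq in seqs]
    base = min(costs)            # subtract the minimum cost for numerical stability
    weights = [math.exp(-(c - base) / lam) for c in costs]
    z = sum(weights)
    # Information-theoretic weighting: low-cost sequences dominate the average.
    return sum(w * seq[0] for w, seq in zip(weights, seqs)) / z

# Starting at s0 = 2, the weighted first action steers the state toward 0.
a0 = mppi_update(s0=2.0)
print(f"first action = {a0:.2f}")
```

Lowering the temperature `lam` concentrates the weights on the single best sampled sequence (approaching a plain argmin), while raising it spreads credit across more samples; this temperature is the knob that connects the scheme to entropy-regularized RL in the cited paper.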
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.