Improving Robot Dual-System Motor Learning with Intrinsically Motivated Meta-Control and Latent-Space Experience Imagination
- URL: http://arxiv.org/abs/2004.08830v3
- Date: Sun, 1 Nov 2020 09:12:31 GMT
- Title: Improving Robot Dual-System Motor Learning with Intrinsically Motivated Meta-Control and Latent-Space Experience Imagination
- Authors: Muhammad Burhan Hafez, Cornelius Weber, Matthias Kerzel, Stefan Wermter
- Abstract summary: We present a novel dual-system motor learning approach where a meta-controller arbitrates online between model-based and model-free decisions.
We evaluate our approach against baseline and state-of-the-art methods on learning vision-based robotic grasping in simulation and the real world.
- Score: 17.356402088852423
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Combining model-based and model-free learning systems has been shown to
improve the sample efficiency of learning to perform complex robotic tasks.
However, dual-system approaches fail to consider the reliability of the learned
model when it is applied to make multiple-step predictions, resulting in a
compounding of prediction errors and performance degradation. In this paper, we
present a novel dual-system motor learning approach where a meta-controller
arbitrates online between model-based and model-free decisions based on an
estimate of the local reliability of the learned model. The reliability
estimate is used in computing an intrinsic feedback signal, encouraging actions
that lead to data that improves the model. Our approach also integrates
arbitration with imagination where a learned latent-space model generates
imagined experiences, based on its local reliability, to be used as additional
training data. We evaluate our approach against baseline and state-of-the-art
methods on learning vision-based robotic grasping in simulation and the real world.
The results show that our approach outperforms the compared methods and learns
near-optimal grasping policies in dense- and sparse-reward environments.
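To make the abstract's arbitration and imagination mechanisms concrete, below is a minimal sketch of how a reliability-gated meta-controller, the intrinsic reward, and reliability-gated imagination could fit together. The window size, the exponential error-to-reliability mapping, the threshold, and all function names (mb_policy, mf_policy, the one-step latent model API) are illustrative assumptions, not the paper's exact formulation.

```python
# Minimal sketch (not the paper's exact formulation) of reliability-gated
# arbitration between model-based (MB) and model-free (MF) control, plus an
# intrinsic reward that encourages actions whose data improves the model.
from collections import deque

import numpy as np


class MetaController:
    """Arbitrates between MB and MF decisions from local model reliability."""

    def __init__(self, window=10, threshold=0.7):
        self.errors = deque(maxlen=window)  # recent one-step prediction errors
        self.threshold = threshold          # minimum reliability to trust the model
        self.prev_reliability = 0.0

    def observe(self, predicted_next, actual_next):
        """Update the local error estimate after each real transition."""
        self.errors.append(float(np.linalg.norm(predicted_next - actual_next)))

    def reliability(self):
        """Map recent prediction error into [0, 1]; higher error, lower trust."""
        return float(np.exp(-np.mean(self.errors))) if self.errors else 0.0

    def select_action(self, state, mb_policy, mf_policy):
        """Use the model's plan only where the model is locally reliable."""
        if self.reliability() >= self.threshold:
            return mb_policy(state)  # model-based decision
        return mf_policy(state)      # model-free fallback

    def intrinsic_reward(self):
        """Positive feedback when new data has made the model more reliable."""
        r = self.reliability()
        bonus = max(0.0, r - self.prev_reliability)
        self.prev_reliability = r
        return bonus


def imagined_rollout(latent_model, policy, state, meta, horizon=5):
    """Generate imagined transitions only from states where the learned
    latent-space model is trusted, so unreliable rollouts never become
    training data (the gating rule here is an assumed simplification)."""
    if meta.reliability() < meta.threshold:
        return []
    transitions = []
    for _ in range(horizon):
        action = policy(state)
        next_state = latent_model(state, action)  # assumed one-step latent API
        transitions.append((state, action, next_state))
        state = next_state
    return transitions
```

In training, the intrinsic bonus would be added to the task reward (e.g., r = r_ext + beta * r_int), and only the reliability-gated imagined transitions would be appended to the replay buffer as additional training data.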
Related papers
- Robotic World Model: A Neural Network Simulator for Robust Policy Optimization in Robotics [50.191655141020505]
We introduce a novel framework for learning world models.
By providing a scalable and robust framework, we pave the way for adaptive and efficient robotic systems in real-world applications.
arXiv Detail & Related papers (2025-01-17T10:39:09Z)
- Towards the Best Solution for Complex System Reliability: Can Statistics Outperform Machine Learning? [39.58317527488534]
This study compares the effectiveness of classical statistical techniques and machine learning methods for improving reliability assessments.
We aim to demonstrate that classical statistical algorithms often yield more precise and interpretable results than black-box machine learning approaches.
arXiv Detail & Related papers (2024-10-05T17:31:18Z)
- Getting More Juice Out of the SFT Data: Reward Learning from Human Demonstration Improves SFT for LLM Alignment [65.15914284008973]
We propose to leverage an Inverse Reinforcement Learning (IRL) technique to simultaneously build a reward model and a policy model.
We show that the proposed algorithms converge to the stationary solutions of the IRL problem.
Our results indicate that it is beneficial to leverage reward learning throughout the entire alignment process.
arXiv Detail & Related papers (2024-05-28T07:11:05Z)
- Deep autoregressive density nets vs neural ensembles for model-based offline reinforcement learning [2.9158689853305693]
We consider a model-based reinforcement learning algorithm that infers the system dynamics from the available data and performs policy optimization on imaginary model rollouts.
This approach is vulnerable to exploiting model errors which can lead to catastrophic failures on the real system.
We show that better performance can be obtained with a single well-calibrated autoregressive model on the D4RL benchmark.
arXiv Detail & Related papers (2024-02-05T10:18:15Z)
- QualEval: Qualitative Evaluation for Model Improvement [82.73561470966658]
We propose QualEval, which augments quantitative scalar metrics with automated qualitative evaluation as a vehicle for model improvement.
QualEval uses a powerful LLM reasoner and our novel flexible linear programming solver to generate human-readable insights.
We demonstrate that leveraging its insights improves the absolute performance of the Llama 2 model by up to 15 percentage points.
arXiv Detail & Related papers (2023-11-06T00:21:44Z)
- Real-to-Sim: Predicting Residual Errors of Robotic Systems with Sparse Data using a Learning-based Unscented Kalman Filter [65.93205328894608]
We learn the residual errors between a dynamics and/or simulator model and the real robot.
We show that with the learned residual errors, we can further close the reality gap between dynamic models, simulations, and actual hardware.
arXiv Detail & Related papers (2022-09-07T15:15:12Z)
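As a concrete reading of the residual-learning idea in the entry above, here is a hedged sketch: a small network learns the gap between a simulator step and the real robot, so the corrected prediction is sim_step(s, a) + g(s, a). The network shape, sim_step, and the training setup are assumptions; the paper itself folds the learned residual into an unscented Kalman filter rather than this plain regression.

```python
# Hedged sketch of residual dynamics learning:
#   real_next ~ sim_step(state, action) + g(state, action)
# Network shape and sim_step are illustrative assumptions, not the paper's setup.
import torch
import torch.nn as nn


class ResidualModel(nn.Module):
    def __init__(self, state_dim, action_dim, hidden=64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(state_dim + action_dim, hidden),
            nn.ReLU(),
            nn.Linear(hidden, state_dim),
        )

    def forward(self, state, action):
        # Predict the simulator-to-reality error for this (state, action) pair
        return self.net(torch.cat([state, action], dim=-1))


def corrected_step(sim_step, residual, state, action):
    """Close the reality gap: simulator prediction plus learned residual."""
    return sim_step(state, action) + residual(state, action)
```

Training then reduces to supervised regression on the sparse real data: minimize the error between real next states and sim_step(s, a) + g(s, a) over logged transitions.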
- Model-Based Imitation Learning Using Entropy Regularization of Model and Policy [0.456877715768796]
We propose model-based Entropy-Regularized Imitation Learning (MB-ERIL) under the entropy-regularized Markov decision process.
A policy discriminator distinguishes the actions generated by a robot from expert ones, and a model discriminator distinguishes the counterfactual state transitions generated by the model from the actual ones.
Computer simulations and real robot experiments show that MB-ERIL achieves competitive performance and significantly improves sample efficiency compared to baseline methods.
arXiv Detail & Related papers (2022-06-21T04:15:12Z)
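A hedged sketch of the two-discriminator structure summarized above: one network scores (state, action) pairs against expert actions, the other scores model-generated transitions against real ones. The architectures and losses below are assumptions, not MB-ERIL's exact entropy-regularized objectives.

```python
# Two GAN-style discriminators in the spirit of MB-ERIL (assumed shapes):
#   policy_d: robot (state, action) vs expert (state, action)
#   model_d:  model transition (s, a, s') vs real transition (s, a, s')
import torch
import torch.nn as nn


def scorer(in_dim, hidden=64):
    return nn.Sequential(
        nn.Linear(in_dim, hidden), nn.ReLU(),
        nn.Linear(hidden, 1), nn.Sigmoid(),
    )


class Discriminators(nn.Module):
    def __init__(self, state_dim, action_dim):
        super().__init__()
        self.policy_d = scorer(state_dim + action_dim)
        self.model_d = scorer(2 * state_dim + action_dim)

    def policy_score(self, s, a):
        return self.policy_d(torch.cat([s, a], dim=-1))

    def model_score(self, s, a, s_next):
        return self.model_d(torch.cat([s, a, s_next], dim=-1))
```

Both discriminators would be trained with a binary cross-entropy loss, and their scores reused as learning signals for the policy and the model respectively.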
- Uncertainty-Aware Model-Based Reinforcement Learning with Application to Autonomous Driving [2.3303341607459687]
We propose a novel uncertainty-aware model-based reinforcement learning framework, and then implement and validate it in autonomous driving.
The framework is built on an adaptive truncation approach, which provides virtual interactions between the agent and the environment model.
The developed algorithms are then implemented in end-to-end autonomous vehicle control tasks, validated and compared with state-of-the-art methods under various driving scenarios.
arXiv Detail & Related papers (2021-06-23T06:55:14Z)
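The adaptive-truncation idea above lends itself to a short sketch: roll a model ensemble forward and stop generating virtual experience once member disagreement (a common uncertainty proxy) grows too large. The ensemble API and the threshold are assumptions, not the paper's exact truncation criterion.

```python
# Sketch of uncertainty-gated virtual interactions (assumed ensemble API).
import numpy as np


def truncated_rollout(ensemble, policy, state, horizon=10, max_disagreement=0.1):
    transitions = []
    for _ in range(horizon):
        action = policy(state)
        preds = np.stack([m(state, action) for m in ensemble])  # one per member
        if preds.std(axis=0).mean() > max_disagreement:
            break  # model too uncertain here: cut the virtual rollout short
        next_state = preds.mean(axis=0)
        transitions.append((state, action, next_state))
        state = next_state
    return transitions
```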
- Model Predictive Actor-Critic: Accelerating Robot Skill Acquisition with Deep Reinforcement Learning [42.525696463089794]
Model Predictive Actor-Critic (MoPAC) is a hybrid model-based/model-free method that combines model predictive rollouts with policy optimization to mitigate model bias.
MoPAC guarantees optimal skill learning up to an approximation error and reduces necessary physical interaction with the environment.
arXiv Detail & Related papers (2021-03-25T13:50:24Z)
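To illustrate the model-predictive-rollout half of MoPAC, here is a generic random-shooting MPC step; MoPAC itself interleaves such rollouts with actor-critic policy optimization, and the model API, reward_fn, and sampling scheme below are assumptions.

```python
# Generic random-shooting MPC over a learned dynamics model (illustrative only).
import numpy as np


def mpc_action(model, reward_fn, state, action_dim, horizon=5, n_candidates=64):
    rng = np.random.default_rng()
    best_return, best_first_action = -np.inf, None
    for _ in range(n_candidates):
        plan = rng.uniform(-1.0, 1.0, size=(horizon, action_dim))
        s, total = state, 0.0
        for a in plan:
            s_next = model(s, a)             # learned one-step dynamics
            total += reward_fn(s, a, s_next)
            s = s_next
        if total > best_return:
            best_return, best_first_action = total, plan[0]
    return best_first_action  # execute only the first action, then replan
```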
- Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques.
Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance.
We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
- Bridging Imagination and Reality for Model-Based Deep Reinforcement Learning [72.18725551199842]
We propose a novel model-based reinforcement learning algorithm, called BrIdging Reality and Dream (BIRD).
It maximizes the mutual information between imaginary and real trajectories so that the policy improvement learned from imaginary trajectories can be easily generalized to real trajectories.
We demonstrate that our approach improves sample efficiency of model-based planning, and achieves state-of-the-art performance on challenging visual control benchmarks.
arXiv Detail & Related papers (2020-10-23T03:22:01Z)