Robotic Lever Manipulation using Hindsight Experience Replay and Shapley
Additive Explanations
- URL: http://arxiv.org/abs/2110.03292v1
- Date: Thu, 7 Oct 2021 09:24:34 GMT
- Title: Robotic Lever Manipulation using Hindsight Experience Replay and Shapley
Additive Explanations
- Authors: Sindre Benjamin Remman and Anastasios M. Lekkas
- Abstract summary: This paper deals with robotic lever control using Explainable Deep Reinforcement Learning.
First, we train a policy by using the Deep Deterministic Policy Gradient algorithm and the Hindsight Experience Replay technique.
We then transfer the policy to the real-world environment, where it achieves performance comparable to the simulated environments for most episodes.
To explain the decisions of the policy, we use the SHAP method to create an explanation model based on the episodes performed in the real-world environment.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper deals with robotic lever control using Explainable Deep
Reinforcement Learning. First, we train a policy by using the Deep
Deterministic Policy Gradient algorithm and the Hindsight Experience Replay
technique, where the goal is to control a robotic manipulator to manipulate a
lever. This enables us both to use continuous states and actions and to learn
with sparse rewards. Being able to learn from sparse rewards is especially
desirable for Deep Reinforcement Learning because designing a reward function
for complex tasks such as this is challenging. We first train in the PyBullet
simulator, which accelerates the training procedure, but is not accurate on
this task compared to the real-world environment. After completing the training
in PyBullet, we further train in the Gazebo simulator, which runs more slowly
than PyBullet, but is more accurate on this task. We then transfer the policy
to the real-world environment, where it achieves performance comparable to
the simulated environments for most episodes. To explain the decisions of the
policy, we use the SHAP method to create an explanation model based on the
episodes performed in the real-world environment. This gives us some results that
agree with intuition, and some that do not. We also question whether the
independence assumption made when approximating the SHAP values influences the
accuracy of these values for a system such as this, where there are some
correlations between the states.
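The paper does not include code, but the training setup described in the abstract maps onto standard tooling. The following is a minimal sketch, assuming Stable-Baselines3 and a goal-conditioned Gymnasium environment; the "LeverEnv-v0" id and all hyperparameters are hypothetical stand-ins, not the authors' implementation.

```python
# Minimal sketch of DDPG + Hindsight Experience Replay with Stable-Baselines3.
# "LeverEnv-v0" is a hypothetical stand-in for the paper's PyBullet lever task;
# HER requires a goal-conditioned env whose observations are a dict with
# "observation", "achieved_goal", and "desired_goal" keys and whose
# compute_reward() returns the sparse reward.
import gymnasium as gym
from stable_baselines3 import DDPG, HerReplayBuffer

env = gym.make("LeverEnv-v0")

model = DDPG(
    "MultiInputPolicy",                   # handles the dict observation space
    env,
    replay_buffer_class=HerReplayBuffer,
    replay_buffer_kwargs=dict(
        n_sampled_goal=4,                 # relabel each transition with 4 virtual goals
        goal_selection_strategy="future", # use goals achieved later in the same episode
    ),
    verbose=1,
)
model.learn(total_timesteps=200_000)      # illustrative budget
model.save("ddpg_her_lever")              # reload for Gazebo fine-tuning / real rollouts
```

The goal relabeling is what makes the sparse reward workable: even an episode that never reaches the desired lever angle yields informative transitions once the states it did reach are treated as goals in hindsight.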
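For the explanation step, the independence caveat in the abstract refers to KernelSHAP, which approximates Shapley values by sampling feature coalitions as if the state variables were independent. Below is a minimal sketch using the shap library; the state and action dimensions, the logged-state array, and the stand-in policy are illustrative assumptions, not the paper's data or model.

```python
# Minimal sketch of explaining a deterministic policy with KernelSHAP.
import numpy as np
import shap

rng = np.random.default_rng(0)
STATE_DIM, ACTION_DIM = 10, 4                  # assumed dimensions

# Stand-in for states logged during the real-world episodes.
real_world_states = rng.normal(size=(150, STATE_DIM))

# Stand-in for the trained policy: KernelExplainer accepts any function
# mapping a batch of states to a batch of actions.
policy_weights = rng.normal(size=(STATE_DIM, ACTION_DIM))

def policy_fn(states: np.ndarray) -> np.ndarray:
    return states @ policy_weights

background = real_world_states[:100]           # reference distribution
explained = real_world_states[100:]            # states to explain

explainer = shap.KernelExplainer(policy_fn, background)
# One set of per-state SHAP values per action dimension
# (list vs. 3-D array depends on the shap version).
shap_values = explainer.shap_values(explained)
```

Because KernelSHAP marginalizes features by drawing them independently from the background data, correlated states can be evaluated at physically implausible combinations, which is precisely the accuracy concern the abstract raises.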
Related papers
- Autonomous Vehicle Controllers From End-to-End Differentiable Simulation [60.05963742334746]
We propose a differentiable simulator and design an analytic policy gradients (APG) approach to training AV controllers.
Our proposed framework brings the differentiable simulator into an end-to-end training loop, where gradients of environment dynamics serve as a useful prior to help the agent learn a more grounded policy.
We find significant improvements in performance and robustness to noise in the dynamics, as well as overall more intuitive human-like handling.
arXiv Detail & Related papers (2024-09-12T11:50:06Z)
- Robust Visual Sim-to-Real Transfer for Robotic Manipulation [79.66851068682779]
Learning visuomotor policies in simulation is much safer and cheaper than in the real world.
However, due to discrepancies between the simulated and real data, simulator-trained policies often fail when transferred to real robots.
One common approach to bridging the visual sim-to-real domain gap is domain randomization (DR).
arXiv Detail & Related papers (2023-07-28T05:47:24Z)
- DexPBT: Scaling up Dexterous Manipulation for Hand-Arm Systems with Population Based Training [10.808149303943948]
We learn dexterous object manipulation using simulated one- or two-armed robots equipped with multi-fingered hand end-effectors.
We introduce a decentralized Population-Based Training (PBT) algorithm that allows us to massively amplify the exploration capabilities of deep reinforcement learning.
arXiv Detail & Related papers (2023-05-20T07:25:27Z)
- DeXtreme: Transfer of Agile In-hand Manipulation from Simulation to Reality [64.51295032956118]
We train a policy that can perform robust dexterous manipulation on an anthropomorphic robot hand.
Our work reaffirms the possibilities of sim-to-real transfer for dexterous manipulation in diverse kinds of hardware and simulator setups.
arXiv Detail & Related papers (2022-10-25T01:51:36Z)
- Real-World Dexterous Object Manipulation based Deep Reinforcement Learning [3.4493195428573613]
We show how to use deep reinforcement learning to control a robot.
Our framework mitigates the low sample efficiency of deep reinforcement learning.
Our algorithm is trained in simulation and transferred to reality without fine-tuning.
arXiv Detail & Related papers (2021-11-22T02:48:05Z)
- Pre-training of Deep RL Agents for Improved Learning under Domain Randomization [63.09932240840656]
We show how to pre-train a perception encoder that already provides an embedding invariant to the randomization.
We demonstrate this yields consistently improved results on a randomized version of DeepMind control suite tasks and a stacking environment on arbitrary backgrounds with zero-shot transfer to a physical robot.
arXiv Detail & Related papers (2021-04-29T14:54:11Z)
- Learning What To Do by Simulating the Past [76.86449554580291]
We show that by combining a learned feature encoder with learned inverse models, we can enable agents to simulate human actions backwards in time to infer what they must have done.
The resulting algorithm is able to reproduce a specific skill in MuJoCo environments given a single state sampled from the optimal policy for that skill.
arXiv Detail & Related papers (2021-04-08T17:43:29Z)
- Accelerated Sim-to-Real Deep Reinforcement Learning: Learning Collision Avoidance from Human Player [5.960346570280513]
This paper presents a sensor-level mapless collision avoidance algorithm for use in mobile robots.
An efficient training strategy is proposed to allow a robot to learn from both human experience data and self-exploratory data.
A game-format simulation framework is designed to allow the human player to tele-operate the mobile robot to a goal.
arXiv Detail & Related papers (2021-02-21T23:27:34Z)
- Semi-supervised reward learning for offline reinforcement learning [71.6909757718301]
Training agents usually requires reward functions, but rewards are seldom available in practice and their engineering is challenging and laborious.
We propose semi-supervised learning algorithms that learn from limited annotations and incorporate unlabelled data.
In our experiments with a simulated robotic arm, we greatly improve upon behavioural cloning and closely approach the performance achieved with ground truth rewards.
arXiv Detail & Related papers (2020-12-12T20:06:15Z)
- Robotic Arm Control and Task Training through Deep Reinforcement Learning [6.249276977046449]
We show that Trust Region Policy Optimization and Deep Q-Network with Normalized Advantage Functions perform better than Deep Deterministic Policy Gradient and Vanilla Policy Gradient.
Real-world experiments show that our policies, if correctly trained in simulation, can be transferred and executed in a real environment with almost no changes.
arXiv Detail & Related papers (2020-05-06T07:34:28Z)
- On Reward Shaping for Mobile Robot Navigation: A Reinforcement Learning and SLAM Based Approach [7.488722678999039]
We present a map-less path planning algorithm based on Deep Reinforcement Learning (DRL) for mobile robots navigating in unknown environments.
The planner is trained using a reward function shaped with online knowledge of the training environment's map.
The policy trained in the simulation environment can be directly and successfully transferred to the real robot.
arXiv Detail & Related papers (2020-02-10T22:00:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of this list (including all information) and is not responsible for any consequences of its use.