Reinforcement Learning in Robotic Motion Planning by Combined
Experience-based Planning and Self-Imitation Learning
- URL: http://arxiv.org/abs/2306.06754v1
- Date: Sun, 11 Jun 2023 19:47:46 GMT
- Title: Reinforcement Learning in Robotic Motion Planning by Combined
Experience-based Planning and Self-Imitation Learning
- Authors: Sha Luo, Lambert Schomaker
- Abstract summary: High-quality and representative data is essential for both Imitation Learning (IL)- and Reinforcement Learning (RL)-based motion planning tasks.
We propose the self-imitation learning by planning plus (SILP+) algorithm, which embeds experience-based planning into the learning architecture.
Various experimental results show that SILP+ achieves better training efficiency and a higher, more stable success rate in complex motion planning tasks.
- Score: 7.919213739992465
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: High-quality and representative data is essential for both Imitation Learning
(IL)- and Reinforcement Learning (RL)-based motion planning tasks. For real
robots, it is challenging to collect enough qualified data either as
demonstrations for IL or experiences for RL due to safety considerations in
environments with obstacles. We target this challenge by proposing the
self-imitation learning by planning plus (SILP+) algorithm, which efficiently
embeds experience-based planning into the learning architecture to mitigate the
data-collection problem. The planner generates demonstrations based on
successfully visited states from the current RL policy, and the policy improves
by learning from these demonstrations. In this way, we relieve the demand for
human expert operators to collect demonstrations required by IL and improve the
RL performance as well. Various experimental results show that SILP+ achieves
better training efficiency and a higher, more stable success rate in complex
motion planning tasks compared to several other methods. Extensive tests on
physical robots illustrate the effectiveness of SILP+ in a physical setting.
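
The abstract's core loop, planning over states the current policy has already visited in order to produce demonstrations, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `plan_demo`, the `radius` connection threshold, and the `is_free` collision check are all hypothetical, and Dijkstra's algorithm stands in for whichever graph-search planner SILP+ actually uses. The sketch treats the visited states of one successful rollout as graph nodes, connects nearby pairs, and returns the shortest start-to-goal path as a demonstration.

```python
import heapq
import math


def plan_demo(visited, radius, is_free=lambda a, b: True):
    """Shortest path from visited[0] to visited[-1] over visited states.

    Nodes are the visited states. An edge joins two states that are at most
    `radius` apart and whose connecting segment passes the hypothetical
    `is_free` collision check (assumed always free by default).
    Returns the path as a list of states, or None if the goal is unreachable.
    """
    n = len(visited)
    dist = [math.inf] * n
    prev = [None] * n
    dist[0] = 0.0
    heap = [(0.0, 0)]
    while heap:
        d, i = heapq.heappop(heap)
        if d > dist[i]:
            continue  # stale queue entry
        if i == n - 1:
            break  # goal reached with its final distance
        for j in range(n):
            if j == i:
                continue
            step = math.dist(visited[i], visited[j])
            if step <= radius and is_free(visited[i], visited[j]):
                if d + step < dist[j]:
                    dist[j] = d + step
                    prev[j] = i
                    heapq.heappush(heap, (dist[j], j))
    if math.isinf(dist[n - 1]):
        return None
    # Walk the predecessor links back from the goal.
    path, k = [], n - 1
    while k is not None:
        path.append(visited[k])
        k = prev[k]
    return path[::-1]


# A wandering rollout that eventually reaches the goal at (3, 0).
rollout = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (1.0, -1.0), (3.0, 0.0)]
demo = plan_demo(rollout, radius=2.1)
print(demo)
```

In this toy rollout the planner skips the two detour states and keeps only the direct route. In a setup like the one the abstract describes, such a shortened path would then be stored as a demonstration for the policy to imitate; how SILP+ stores and weights those demonstrations is not specified here.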
Related papers
- AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z)
- Tactile Active Inference Reinforcement Learning for Efficient Robotic Manipulation Skill Acquisition [10.072992621244042]
We propose a novel method for skill learning in robotic manipulation called Tactile Active Inference Reinforcement Learning (Tactile-AIRL).
To enhance the performance of reinforcement learning (RL), we introduce active inference, which integrates model-based techniques and intrinsic curiosity into the RL process.
We demonstrate that our method achieves significantly higher training efficiency in non-prehensile object-pushing tasks.
arXiv Detail & Related papers (2023-11-19T10:19:22Z)
- Imitation Bootstrapped Reinforcement Learning [31.916571349600684]
Imitation bootstrapped reinforcement learning (IBRL) is a novel framework for sample-efficient reinforcement learning.
We evaluate IBRL on 6 simulation and 3 real-world tasks spanning various difficulty levels.
arXiv Detail & Related papers (2023-11-03T19:03:20Z)
- Efficient Learning of High Level Plans from Play [57.29562823883257]
We present Efficient Learning of High-Level Plans from Play (ELF-P), a framework for robotic learning that bridges motion planning and deep RL.
We demonstrate that ELF-P has significantly better sample efficiency than relevant baselines over multiple realistic manipulation tasks.
arXiv Detail & Related papers (2023-03-16T20:09:47Z)
- Hindsight States: Blending Sim and Real Task Elements for Efficient Reinforcement Learning [61.3506230781327]
In robotics, one approach to generate training data builds on simulations based on dynamics models derived from first principles.
Here, we leverage the imbalance in complexity of the dynamics to learn more sample-efficiently.
We validate our method on several challenging simulated tasks and demonstrate that it improves learning both alone and when combined with an existing hindsight algorithm.
arXiv Detail & Related papers (2023-03-03T21:55:04Z)
- Lifelong Learning Metrics [63.8376359764052]
The DARPA Lifelong Learning Machines (L2M) program seeks to yield advances in artificial intelligence (AI) systems.
This document outlines a formalism for constructing and characterizing the performance of agents performing lifelong learning scenarios.
arXiv Detail & Related papers (2022-01-20T16:29:14Z)
- Creativity of AI: Hierarchical Planning Model Learning for Facilitating Deep Reinforcement Learning [19.470693909025798]
We introduce a novel deep reinforcement learning framework with symbolic options.
Our framework features a loop training procedure that guides policy improvement.
We conduct experiments on two domains, Montezuma's Revenge and Office World.
arXiv Detail & Related papers (2021-12-18T03:45:28Z)
- Demonstration-Guided Reinforcement Learning with Learned Skills [23.376115889936628]
Demonstration-guided reinforcement learning (RL) is a promising approach for learning complex behaviors.
In this work, we aim to exploit this shared subtask structure to increase the efficiency of demonstration-guided RL.
We propose Skill-based Learning with Demonstrations (SkiLD), an algorithm for demonstration-guided RL that efficiently leverages the provided demonstrations.
arXiv Detail & Related papers (2021-07-21T17:59:34Z)
- Self-Imitation Learning by Planning [3.996275177789895]
Imitation learning (IL) enables robots to acquire skills quickly by transferring expert knowledge.
In long-horizon motion planning tasks, a challenging problem in deploying IL and RL methods is how to generate and collect massive, broadly distributed data.
We propose self-imitation learning by planning (SILP), where demonstration data are collected automatically by planning on the visited states from the current policy.
SILP is inspired by the observation that successfully visited states in the early reinforcement learning stage are collision-free nodes in the graph-search based motion planner.
arXiv Detail & Related papers (2021-03-25T13:28:38Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- AWAC: Accelerating Online Reinforcement Learning with Offline Datasets [84.94748183816547]
We show that our method, advantage weighted actor critic (AWAC), enables rapid learning of skills with a combination of prior demonstration data and online experience.
Our results show that incorporating prior data can reduce the time required to learn a range of robotic skills to practical time-scales.
arXiv Detail & Related papers (2020-06-16T17:54:41Z)
This list is automatically generated from the titles and abstracts of the papers in this site.