Boosting Reinforcement Learning and Planning with Demonstrations: A
Survey
- URL: http://arxiv.org/abs/2303.13489v2
- Date: Mon, 27 Mar 2023 19:25:01 GMT
- Title: Boosting Reinforcement Learning and Planning with Demonstrations: A
Survey
- Authors: Tongzhou Mu, Hao Su
- Abstract summary: We discuss the advantages of using demonstrations in sequential decision making.
We exemplify a practical pipeline for generating and utilizing demonstrations in the recently proposed ManiSkill robot learning benchmark.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although reinforcement learning has seen tremendous success recently, this
kind of trial-and-error learning can be impractical or inefficient in complex
environments. The use of demonstrations, on the other hand, enables agents to
benefit from expert knowledge rather than having to discover the best action to
take through exploration. In this survey, we discuss the advantages of using
demonstrations in sequential decision making, various ways to apply
demonstrations in learning-based decision making paradigms (for example,
reinforcement learning and planning in the learned models), and how to collect
the demonstrations in various scenarios. Additionally, we exemplify a practical
pipeline for generating and utilizing demonstrations in the recently proposed
ManiSkill robot learning benchmark.
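The abstract describes using demonstrations so that agents benefit from expert knowledge instead of pure trial-and-error exploration. As a minimal, self-contained illustration of this idea (a toy behavior-cloning warm start on synthetic data, not the survey's ManiSkill pipeline), the sketch below fits a linear policy to expert state-action pairs; a downstream RL or planning stage could then start from this expert-like policy rather than from random behavior:

```python
import numpy as np

# Hypothetical toy setup: demonstrations are (state, action) pairs from an
# expert linear policy a = W_expert @ s plus small noise. Behavior cloning
# fits a policy to these pairs so a learning agent can start from
# expert-like behavior instead of discovering it through exploration.
rng = np.random.default_rng(0)
W_expert = np.array([[1.0, -0.5], [0.3, 2.0]])

states = rng.normal(size=(500, 2))                                # demo states
actions = states @ W_expert.T + 0.01 * rng.normal(size=(500, 2))  # demo actions

# Behavior cloning: least-squares fit of a linear policy to the demos.
W_bc, *_ = np.linalg.lstsq(states, actions, rcond=None)
W_bc = W_bc.T

# On held-out states, the cloned policy closely matches the expert,
# giving the downstream RL or planning stage a strong initialization.
test_states = rng.normal(size=(100, 2))
error = np.max(np.abs(test_states @ W_bc.T - test_states @ W_expert.T))
print(f"max action error vs expert: {error:.4f}")
```

Real pipelines replace the linear fit with a neural network and follow it with RL fine-tuning, but the division of labor is the same: demonstrations supply the initial policy, exploration refines it.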
Related papers
- Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions [8.869100154323643]
We propose a novel prompt engineering workflow built around a novel object called the "demonstration notebook"
This notebook helps identify the most suitable in-context learning example for a question by gathering and reusing information from the LLM's past interactions.
Our experiments show that this approach outperforms all existing methods for automatic demonstration construction and selection.
arXiv Detail & Related papers (2024-06-16T10:02:20Z) - AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z) - Imitation Learning from Purified Demonstration [55.23663861003027]
We propose to purify the potential perturbations in imperfect demonstrations and conduct imitation learning from purified demonstrations.
We provide theoretical evidence supporting our approach, demonstrating that total variance distance between the purified and optimal demonstration distributions can be upper-bounded.
arXiv Detail & Related papers (2023-10-11T02:36:52Z) - Skill Disentanglement for Imitation Learning from Suboptimal
Demonstrations [60.241144377865716]
We consider the imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set.
We propose a method that evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills.
arXiv Detail & Related papers (2023-06-13T17:24:37Z) - A Survey of Demonstration Learning [0.0]
Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations.
It is gaining significant traction due to its tremendous potential for learning complex behaviors from demonstrations.
Because it learns without interacting with the environment, demonstration learning could enable the automation of a wide range of real-world applications such as robotics and healthcare.
arXiv Detail & Related papers (2023-03-20T15:22:10Z) - Out-of-Dynamics Imitation Learning from Multimodal Demonstrations [68.46458026983409]
We study out-of-dynamics imitation learning (OOD-IL), which relaxes the usual assumption so that the demonstrator and the imitator need only share the same state space.
OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge.
We develop a better transferability measurement to tackle this newly-emerged challenge.
arXiv Detail & Related papers (2022-11-13T07:45:06Z) - Let Me Check the Examples: Enhancing Demonstration Learning via Explicit
Imitation [9.851250429233634]
Demonstration learning aims to guide prompt prediction by providing answered demonstrations in few-shot settings.
Existing work only incorporates the answered examples as demonstrations into the prompt template without any additional operation.
We introduce Imitation DEMOnstration Learning (Imitation-Demo) to strengthen demonstration learning via explicitly imitating human review behaviour.
arXiv Detail & Related papers (2022-08-31T06:59:36Z) - Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z) - Learning from Imperfect Demonstrations from Agents with Varying Dynamics [29.94164262533282]
We develop a metric composed of a feasibility score and an optimality score to measure how useful a demonstration is for imitation learning.
Our experiments on four environments in simulation and on a real robot show improved learned policies with higher expected return.
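The summary above scores each demonstration by a feasibility score and an optimality score. As an illustrative sketch only (the scores and the way they are combined here are assumptions, not the paper's formulas), one might weight demonstrations like this:

```python
import math

# Hypothetical weighting: (a) feasibility, how well a demo's transitions can
# be reproduced under the imitator's own dynamics (low dynamics error = high
# feasibility), and (b) optimality, how high the demo's return is relative to
# the best observed return. Useful demos get a high weight in imitation.
def demo_weight(dynamics_error, demo_return, best_return, alpha=1.0):
    feasibility = math.exp(-alpha * dynamics_error)  # in (0, 1]; 1 = fully reproducible
    optimality = demo_return / best_return           # in [0, 1] for positive returns
    return feasibility * optimality

# A reproducible, near-optimal demo is weighted highly...
w_good = demo_weight(dynamics_error=0.1, demo_return=95.0, best_return=100.0)
# ...while a demo that is infeasible under the imitator's dynamics, or far
# from optimal, is strongly down-weighted.
w_bad = demo_weight(dynamics_error=3.0, demo_return=40.0, best_return=100.0)
```

Multiplying the two scores means a demonstration must be both feasible and near-optimal to matter; either factor alone can drive its weight toward zero.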
arXiv Detail & Related papers (2021-03-10T07:39:38Z) - Reinforcement Learning with Supervision from Noisy Demonstrations [38.00968774243178]
We propose a novel framework to adaptively learn the policy by jointly interacting with the environment and exploiting the expert demonstrations.
Experimental results in various environments with multiple popular reinforcement learning algorithms show that the proposed approach can learn robustly with noisy demonstrations.
arXiv Detail & Related papers (2020-06-14T06:03:06Z) - State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations.
Our method performs on par with state-action approaches and considerably outperforms RL alone.
arXiv Detail & Related papers (2020-04-07T17:57:20Z)
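The state-only imitation learning entry above trains an inverse dynamics model and uses it to predict actions for state-only demonstrations. A minimal sketch of that recipe, using a hypothetical linear toy environment rather than the paper's dexterous manipulation setup:

```python
import numpy as np

# Hypothetical linear environment: s' = A @ s + B @ a. An inverse dynamics
# model predicts the action from a (state, next_state) pair; it is trained on
# the imitator's own interaction data (where actions are known) and then used
# to label state-only demonstrations, after which ordinary behavior cloning
# on (state, predicted action) pairs applies.
rng = np.random.default_rng(1)
A = np.array([[0.9, 0.1], [0.0, 0.8]])
B = np.eye(2)

# Interaction data collected by the imitator itself, with known actions.
s = rng.normal(size=(400, 2))
a = rng.normal(size=(400, 2))
s_next = s @ A.T + a @ B.T

# Fit the inverse dynamics model a ≈ [s, s'] @ W by least squares.
X = np.hstack([s, s_next])
W, *_ = np.linalg.lstsq(X, a, rcond=None)

# Label a state-only demonstration (actions unobserved) with predicted actions.
demo_s = rng.normal(size=(50, 2))
demo_a_true = rng.normal(size=(50, 2))          # hidden from the imitator
demo_s_next = demo_s @ A.T + demo_a_true @ B.T  # what the imitator observes
demo_a_pred = np.hstack([demo_s, demo_s_next]) @ W
```

In this noiseless linear case the true action is an exact linear function of (s, s'), so the least-squares model recovers it; in practice the inverse dynamics model is a neural network trained on the agent's exploration data.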
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.