Boosting Reinforcement Learning and Planning with Demonstrations: A
Survey
- URL: http://arxiv.org/abs/2303.13489v2
- Date: Mon, 27 Mar 2023 19:25:01 GMT
- Title: Boosting Reinforcement Learning and Planning with Demonstrations: A
Survey
- Authors: Tongzhou Mu, Hao Su
- Abstract summary: We discuss the advantages of using demonstrations in sequential decision making.
We exemplify a practical pipeline for generating and utilizing demonstrations in the recently proposed ManiSkill robot learning benchmark.
- Score: 25.847796336059343
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although reinforcement learning has seen tremendous success recently, this
kind of trial-and-error learning can be impractical or inefficient in complex
environments. The use of demonstrations, on the other hand, enables agents to
benefit from expert knowledge rather than having to discover the best action to
take through exploration. In this survey, we discuss the advantages of using
demonstrations in sequential decision making, various ways to apply
demonstrations in learning-based decision making paradigms (for example,
reinforcement learning and planning in the learned models), and how to collect
the demonstrations in various scenarios. Additionally, we exemplify a practical
pipeline for generating and utilizing demonstrations in the recently proposed
ManiSkill robot learning benchmark.
Related papers
- Screw Geometry Meets Bandits: Incremental Acquisition of Demonstrations to Generate Manipulation Plans [9.600625243282618]
We study the problem of methodically obtaining a sufficient set of kinesthetic demonstrations, one at a time.
We present a novel approach to address these open problems using (i) a screw geometric representation to generate manipulation plans from demonstrations.
We present experimental results on two example manipulation tasks, namely, pouring and scooping, to illustrate our approach.
arXiv Detail & Related papers (2024-10-23T20:57:56Z)
- Demonstration Notebook: Finding the Most Suited In-Context Learning Example from Interactions [8.869100154323643]
We propose a novel prompt engineering workflow built around a new object called the "demonstration notebook".
This notebook helps identify the most suitable in-context learning example for a question by gathering and reusing information from the LLM's past interactions.
Our experiments show that this approach outperforms all existing methods for automatic demonstration construction and selection.
arXiv Detail & Related papers (2024-06-16T10:02:20Z)
- AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z)
- Skill Disentanglement for Imitation Learning from Suboptimal Demonstrations [60.241144377865716]
We consider the imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set.
We propose a method that evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills.
arXiv Detail & Related papers (2023-06-13T17:24:37Z)
- A Survey of Demonstration Learning [0.0]
Demonstration Learning is a paradigm in which an agent learns to perform a task by imitating the behavior of an expert shown in demonstrations.
It is gaining significant traction due to having tremendous potential for learning complex behaviors from demonstrations.
Because it learns without interacting with the environment, demonstration learning could allow the automation of a wide range of real-world applications, such as robotics and healthcare.
arXiv Detail & Related papers (2023-03-20T15:22:10Z)
- Out-of-Dynamics Imitation Learning from Multimodal Demonstrations [68.46458026983409]
We study out-of-dynamics imitation learning (OOD-IL), which relaxes the usual assumption to require only that the demonstrator and the imitator have the same state spaces.
OOD-IL enables imitation learning to utilize demonstrations from a wide range of demonstrators but introduces a new challenge.
We develop a better transferability measurement to tackle this newly-emerged challenge.
arXiv Detail & Related papers (2022-11-13T07:45:06Z)
- Robustness of Demonstration-based Learning Under Limited Data Scenario [54.912936555876826]
Demonstration-based learning has shown great potential in stimulating the abilities of pretrained language models in limited-data scenarios.
Why such demonstrations are beneficial for the learning process remains unclear since there is no explicit alignment between the demonstrations and the predictions.
In this paper, we design pathological demonstrations by gradually removing intuitively useful information from the standard ones to take a deep dive into the robustness of demonstration-based sequence labeling.
arXiv Detail & Related papers (2022-10-19T16:15:04Z)
- Let Me Check the Examples: Enhancing Demonstration Learning via Explicit Imitation [9.851250429233634]
Demonstration learning aims to guide the prompt prediction by providing answered demonstrations in few-shot settings.
Existing work only incorporates the answered examples as demonstrations into the prompt template, without any additional operation.
We introduce Imitation DEMOnstration Learning (Imitation-Demo) to strengthen demonstration learning via explicitly imitating human review behaviour.
arXiv Detail & Related papers (2022-08-31T06:59:36Z)
- Visual Adversarial Imitation Learning using Variational Models [60.69745540036375]
Reward function specification remains a major impediment for learning behaviors through deep reinforcement learning.
Visual demonstrations of desired behaviors often present an easier and more natural way to teach agents.
We develop a variational model-based adversarial imitation learning algorithm.
arXiv Detail & Related papers (2021-07-16T00:15:18Z)
- Reinforcement Learning with Supervision from Noisy Demonstrations [38.00968774243178]
We propose a novel framework to adaptively learn the policy by jointly interacting with the environment and exploiting the expert demonstrations.
Experimental results in various environments with multiple popular reinforcement learning algorithms show that the proposed approach can learn robustly with noisy demonstrations.
arXiv Detail & Related papers (2020-06-14T06:03:06Z)
- State-Only Imitation Learning for Dexterous Manipulation [63.03621861920732]
In this paper, we explore state-only imitation learning.
We train an inverse dynamics model and use it to predict actions for state-only demonstrations.
Our method performs on par with state-action approaches and considerably outperforms RL alone.
arXiv Detail & Related papers (2020-04-07T17:57:20Z)