Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse
Reinforcement Learning
- URL: http://arxiv.org/abs/2211.15542v3
- Date: Tue, 2 Jan 2024 06:36:38 GMT
- Title: Autonomous Assessment of Demonstration Sufficiency via Bayesian Inverse
Reinforcement Learning
- Authors: Tu Trinh, Haoyu Chen, Daniel S. Brown
- Abstract summary: We propose a novel self-assessment approach based on Bayesian inverse reinforcement learning and value-at-risk.
We show that our approach successfully enables robots to perform at users' desired performance levels.
- Score: 22.287031690633174
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We examine the problem of determining demonstration sufficiency: how can a
robot self-assess whether it has received enough demonstrations from an expert
to ensure a desired level of performance? To address this problem, we propose a
novel self-assessment approach based on Bayesian inverse reinforcement learning
and value-at-risk, enabling learning-from-demonstration ("LfD") robots to
compute high-confidence bounds on their performance and use these bounds to
determine when they have a sufficient number of demonstrations. We propose and
evaluate two definitions of sufficiency: (1) normalized expected value
difference, which measures regret with respect to the human's unobserved reward
function, and (2) percent improvement over a baseline policy. We demonstrate
how to formulate high-confidence bounds on both of these metrics. We evaluate
our approach in simulation for both discrete and continuous state-space domains
and illustrate the feasibility of developing a robotic system that can
accurately evaluate demonstration sufficiency. We also show that the robot can
use active learning to request demonstrations from specific states, reducing
the number of demonstrations needed while still maintaining high confidence
in its policy. Finally, via a user study, we show that our approach
successfully enables robots to perform at users' desired performance levels,
without needing too many or perfectly optimal demonstrations.
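The sufficiency test described above can be sketched in a few lines: draw reward-weight samples from the Bayesian IRL posterior, compute the normalized expected value difference (nEVD) of the robot's policy under each sample, and take an upper quantile as the value-at-risk bound. The function name, the linear-reward/feature-expectation setup, and the normalization by the optimal value are illustrative assumptions, not the paper's exact formulation:

```python
import numpy as np

def var_bound_nevd(posterior_weights, policy_features, eval_policy, alpha=0.95):
    """(1 - alpha)-confidence VaR upper bound on normalized expected value
    difference (nEVD), sketched under a linear-reward assumption.

    posterior_weights: (n_samples, d) reward-weight samples from Bayesian IRL
                       (e.g., MCMC draws); hypothetical input format.
    policy_features:   (n_policies, d) expected discounted feature counts of
                       each candidate policy.
    eval_policy:       index of the robot's current policy.
    """
    # Value of every candidate policy under every sampled reward:
    # V(pi; w) = w . mu(pi) for linear rewards.
    values = posterior_weights @ policy_features.T   # (n_samples, n_policies)
    v_opt = values.max(axis=1)                       # best value per reward sample
    v_eval = values[:, eval_policy]                  # robot's policy per sample
    # Normalized regret per posterior sample (guard against zero denominator).
    nevd = (v_opt - v_eval) / np.maximum(np.abs(v_opt), 1e-8)
    # alpha-quantile: with posterior probability >= alpha, true nEVD lies below.
    return np.quantile(nevd, alpha)
```

A robot using this sketch would declare its demonstrations sufficient once the returned bound falls below a user-chosen regret threshold epsilon; otherwise it requests more demonstrations.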
Related papers
- AdaDemo: Data-Efficient Demonstration Expansion for Generalist Robotic Agent [75.91274222142079]
In this study, we aim to scale up demonstrations in a data-efficient way to facilitate the learning of generalist robotic agents.
AdaDemo is a framework designed to improve multi-task policy learning by actively and continually expanding the demonstration dataset.
arXiv Detail & Related papers (2024-04-11T01:59:29Z)
- How Can Everyday Users Efficiently Teach Robots by Demonstrations? [3.6145826787059643]
We propose to use a measure of uncertainty, namely task-related information entropy, as a criterion for suggesting informative demonstration examples to human teachers.
The results indicated a substantial improvement in robot learning efficiency from the teacher's demonstrations.
arXiv Detail & Related papers (2023-10-19T18:21:39Z)
- Skill Disentanglement for Imitation Learning from Suboptimal Demonstrations [60.241144377865716]
We consider the imitation of sub-optimal demonstrations, with both a small clean demonstration set and a large noisy set.
We propose a method that evaluates and imitates at the sub-demonstration level, encoding action primitives of varying quality into different skills.
arXiv Detail & Related papers (2023-06-13T17:24:37Z)
- Distributional Instance Segmentation: Modeling Uncertainty and High Confidence Predictions with Latent-MaskRCNN [77.0623472106488]
In this paper, we explore a class of distributional instance segmentation models using latent codes.
For robotic picking applications, we propose a confidence mask method to achieve the high precision necessary.
We show that our method can significantly reduce critical errors in robotic systems, including our newly released dataset of ambiguous scenes.
arXiv Detail & Related papers (2023-05-03T05:57:29Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Leveraging Sequentiality in Reinforcement Learning from a Single Demonstration [68.94506047556412]
We propose to leverage a sequential bias to learn control policies for complex robotic tasks using a single demonstration.
We show that DCIL-II can solve with unprecedented sample efficiency some challenging simulated tasks such as humanoid locomotion and stand-up.
arXiv Detail & Related papers (2022-11-09T10:28:40Z)
- Learning Agile Skills via Adversarial Imitation of Rough Partial Demonstrations [19.257876507104868]
Learning agile skills is one of the main challenges in robotics.
We propose a generative adversarial method for inferring reward functions from partial and potentially physically incompatible demonstrations.
We show that by using a Wasserstein GAN formulation and transitions from demonstrations with rough and partial information as input, we are able to extract policies that are robust and capable of imitating demonstrated behaviors.
arXiv Detail & Related papers (2022-06-23T13:34:11Z)
- Learning Feasibility to Imitate Demonstrators with Different Dynamics [23.239058855103067]
The goal of learning from demonstrations is to learn a policy for an agent (imitator) by mimicking the behavior in the demonstrations.
We learn a feasibility metric that captures the likelihood of a demonstration being feasible by the imitator.
Our experiments on four simulated environments and on a real robot show that the policy learned with our approach achieves a higher expected return than prior works.
arXiv Detail & Related papers (2021-10-28T14:15:47Z)
- Interactive Robot Training for Non-Markov Tasks [6.252236971703546]
We propose a Bayesian interactive robot training framework that allows the robot to learn from demonstrations provided by a teacher.
We also present an active learning approach to identify the task execution with the most uncertain degree of acceptability.
We demonstrate the efficacy of our approach in a real-world setting through a user-study based on teaching a robot to set a dinner table.
arXiv Detail & Related papers (2020-03-04T18:19:05Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
- Heterogeneous Learning from Demonstration [0.0]
We propose a framework for learning from heterogeneous demonstration based upon Bayesian inference.
We evaluate a suite of approaches on a real-world dataset of gameplay from StarCraft II.
arXiv Detail & Related papers (2020-01-27T03:08:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.