ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive
Imitation Learning
- URL: http://arxiv.org/abs/2109.08273v1
- Date: Fri, 17 Sep 2021 01:21:16 GMT
- Title: ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive
Imitation Learning
- Authors: Ryan Hoque, Ashwin Balakrishna, Ellen Novoseller, Albert Wilcox,
Daniel S. Brown, Ken Goldberg
- Abstract summary: ThriftyDAgger is an algorithm for querying a human supervisor given a desired budget of human interventions.
Experiments suggest that ThriftyDAgger's intervention criteria balance task performance and supervisor burden more effectively than prior algorithms.
- Score: 23.177329496817105
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Effective robot learning often requires online human feedback and
interventions that can cost significant human time, giving rise to the central
challenge in interactive imitation learning: is it possible to control the
timing and length of interventions to both facilitate learning and limit burden
on the human supervisor? This paper presents ThriftyDAgger, an algorithm for
actively querying a human supervisor given a desired budget of human
interventions. ThriftyDAgger uses a learned switching policy to solicit
interventions only at states that are sufficiently (1) novel, where the robot
policy has no reference behavior to imitate, or (2) risky, where the robot has
low confidence in task completion. To detect the latter, we introduce a novel
metric for estimating risk under the current robot policy. Experiments in
simulation and on a physical cable routing experiment suggest that
ThriftyDAgger's intervention criteria balance task performance and supervisor
burden more effectively than prior algorithms. ThriftyDAgger can also be
applied at execution time, where it achieves a 100% success rate on both the
simulation and physical tasks. A user study (N=10) in which users control a
three-robot fleet while also performing a concentration task suggests that
ThriftyDAgger increases human and robot performance by 58% and 80% respectively
compared to the next best algorithm while reducing supervisor burden.
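The novelty-or-risk gate described in the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: the ensemble-disagreement novelty proxy, the learned success-probability risk estimate, and all names and thresholds here are assumptions. In the actual algorithm, the thresholds would be tuned so that the expected intervention frequency matches the desired human budget.

```python
import numpy as np

def should_query_human(state, policy_ensemble, risk_estimator,
                       novelty_thresh, risk_thresh):
    """Request a human intervention when the state is novel OR risky.

    Novelty proxy (assumption): disagreement among an ensemble of
    policy heads -- high variance suggests no reference behavior to imitate.
    Risk proxy (assumption): a learned estimate of the probability that
    the current robot policy completes the task from this state.
    """
    actions = np.array([pi(state) for pi in policy_ensemble])
    novelty = actions.std(axis=0).mean()   # ensemble disagreement
    p_success = risk_estimator(state)      # estimated task-completion probability
    risky = p_success < risk_thresh        # low confidence in task completion
    return novelty > novelty_thresh or risky
```

In this sketch, raising `novelty_thresh` and lowering `risk_thresh` makes the robot query less often, which is how an intervention budget could be enforced.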
Related papers
- Robot-Gated Interactive Imitation Learning with Adaptive Intervention Mechanism [48.41735416075536]
  Interactive Imitation Learning (IIL) allows agents to acquire desired behaviors through human interventions. We propose the Adaptive Intervention Mechanism (AIM), a novel robot-gated IIL algorithm that learns an adaptive criterion for requesting human demonstrations.
  arXiv Detail & Related papers (2025-06-10T18:43:26Z)
- Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [48.65867987106428]
  We introduce a novel system for joint learning between human operators and robots. It enables human operators to share control of a robot end-effector with a learned assistive agent. It reduces the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks.
  arXiv Detail & Related papers (2024-06-29T03:37:29Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
  In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on. In this work, we propose MEDAL++, a novel design for self-improving robotic systems. The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
  arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Robot Learning on the Job: Human-in-the-Loop Autonomy and Learning During Deployment [25.186525630548356]
  Sirius is a principled framework for humans and robots to collaborate through a division of work. Partially autonomous robots are tasked with handling the major portion of decision-making where they work reliably. We introduce a new learning algorithm to improve the policy's performance on the data collected from the task executions.
  arXiv Detail & Related papers (2022-11-15T18:53:39Z)
- Learning Action Duration and Synergy in Task Planning for Human-Robot Collaboration [6.373435464104705]
  The duration of an action depends on agents' capabilities and the correlation between actions performed simultaneously by the human and the robot. This paper proposes an approach to learning actions' costs and the coupling between actions executed concurrently by humans and robots.
  arXiv Detail & Related papers (2022-10-21T01:08:11Z)
- Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning [121.9708998627352]
  Recent work has shown that, in practical robot learning applications, the effects of adversarial training do not pose a fair trade-off. This work revisits the robustness-accuracy trade-off in robot learning by analyzing whether recent advances in robust training methods and theory can make adversarial training suitable for real-world robot applications.
  arXiv Detail & Related papers (2022-04-15T08:12:15Z)
- Active Uncertainty Learning for Human-Robot Interaction: An Implicit Dual Control Approach [5.05828899601167]
  We present an algorithmic approach to enable uncertainty learning for human-in-the-loop motion planning based on the implicit dual control paradigm. Our approach relies on a sampling-based approximation of the dynamic programming model predictive control problem. The resulting policy is shown to preserve the dual control effect for generic human predictive models with both continuous and categorical uncertainty.
  arXiv Detail & Related papers (2022-02-15T20:40:06Z)
- Persistent Reinforcement Learning via Subgoal Curricula [114.83989499740193]
  Value-accelerated Persistent Reinforcement Learning (VaPRL) generates a curriculum of initial states. VaPRL reduces the interventions required by three orders of magnitude compared to episodic reinforcement learning.
  arXiv Detail & Related papers (2021-07-27T16:39:45Z)
- LazyDAgger: Reducing Context Switching in Interactive Imitation Learning [23.246687273191412]
  We present LazyDAgger, which extends the interactive imitation learning (IL) algorithm SafeDAgger to reduce context switches between supervisor and autonomous control. We find that LazyDAgger improves the performance and robustness of the learned policy during both learning and execution.
  arXiv Detail & Related papers (2021-03-31T18:22:53Z)
- Show Me What You Can Do: Capability Calibration on Reachable Workspace for Human-Robot Collaboration [83.4081612443128]
  We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth. We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
  arXiv Detail & Related papers (2021-03-06T09:14:30Z)
- Human-Robot Team Coordination with Dynamic and Latent Human Task Proficiencies: Scheduling with Learning Curves [0.0]
  We introduce a novel resource coordination approach that enables robots to explore the relative strengths and learning abilities of their human teammates. We generate and evaluate a robust schedule while discovering the latent individual worker proficiency. Results indicate that scheduling strategies favoring exploration tend to be beneficial for human-robot collaboration.
  arXiv Detail & Related papers (2020-07-03T19:44:22Z)
- Thinking While Moving: Deep Reinforcement Learning with Concurrent Control [122.49572467292293]
  We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system. Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed.
  arXiv Detail & Related papers (2020-04-13T17:49:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.