LazyDAgger: Reducing Context Switching in Interactive Imitation Learning
- URL: http://arxiv.org/abs/2104.00053v1
- Date: Wed, 31 Mar 2021 18:22:53 GMT
- Title: LazyDAgger: Reducing Context Switching in Interactive Imitation Learning
- Authors: Ryan Hoque, Ashwin Balakrishna, Carl Putterman, Michael Luo, Daniel S.
Brown, Daniel Seita, Brijen Thananjeyan, Ellen Novoseller, Ken Goldberg
- Abstract summary: We present LazyDAgger, which extends the interactive imitation learning (IL) algorithm SafeDAgger to reduce context switches between supervisor and autonomous control.
We find that LazyDAgger improves the performance and robustness of the learned policy during both learning and execution.
- Score: 23.246687273191412
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Corrective interventions while a robot is learning to automate a task provide
an intuitive method for a human supervisor to assist the robot and convey
information about desired behavior. However, these interventions can impose
significant burden on a human supervisor, as each intervention interrupts other
work the human is doing, incurs latency with each context switch between
supervisor and autonomous control, and requires time to perform. We present
LazyDAgger, which extends the interactive imitation learning (IL) algorithm
SafeDAgger to reduce context switches between supervisor and autonomous
control. We find that LazyDAgger improves the performance and robustness of the
learned policy during both learning and execution while limiting burden on the
supervisor. Simulation experiments suggest that LazyDAgger can reduce context
switches by an average of 60% over SafeDAgger on 3 continuous control tasks
while maintaining state-of-the-art policy performance. In physical fabric
manipulation experiments with an ABB YuMi robot, LazyDAgger reduces context
switches by 60% while achieving a 60% higher success rate than SafeDAgger at
execution time.
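
The mechanism described above amounts to a gated control loop with hysteresis: control passes to the supervisor when an estimated learner-supervisor action discrepancy crosses a high threshold, and returns to the learner only once it falls below a lower one, so control does not thrash between modes. The sketch below illustrates this idea under stated assumptions: a gym-style `env`, a learned `discrepancy(s)` estimator, and thresholds `tau_high > tau_low`. All names are illustrative, not the authors' actual code.

```python
import numpy as np

def lazydagger_style_rollout(env, policy, supervisor, discrepancy,
                             tau_high, tau_low, noise_std=0.1, horizon=500):
    """One hysteresis-gated rollout (illustrative sketch, not the paper's code).

    Control passes to the supervisor when discrepancy(s) > tau_high and
    returns to the learner only when discrepancy(s) < tau_low. Because
    tau_high > tau_low, each hand-off persists until the learner is back
    in distribution, which reduces the number of context switches
    relative to a single-threshold SafeDAgger-style gate.
    """
    labeled, switches = [], 0
    human_in_control = False
    s = env.reset()
    for _ in range(horizon):
        d = discrepancy(s)
        if not human_in_control and d > tau_high:
            human_in_control, switches = True, switches + 1   # hand off to the human
        elif human_in_control and d < tau_low:
            human_in_control, switches = False, switches + 1  # resume autonomy
        if human_in_control:
            a = np.asarray(supervisor(s), dtype=float)
            labeled.append((s, a))  # only supervised steps yield training labels
            # The paper also injects small noise during interventions to
            # widen the set of states that receive supervisor labels.
            a = a + np.random.normal(0.0, noise_std, size=a.shape)
        else:
            a = policy(s)
        s, _, done, _ = env.step(a)  # classic gym-style step signature
        if done:
            break
    return labeled, switches
```

After each rollout, the labeled pairs would be aggregated into the training set and the policy retrained, as in the usual DAgger loop.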
Related papers
- Teaching Robots to Handle Nuclear Waste: A Teleoperation-Based Learning Approach [8.587182001055448]
The proposed framework addresses challenges in nuclear waste handling tasks, which often involve repetitive and meticulous manipulation operations.
The framework captures operator movements and manipulation forces during teleoperation and uses this data to train machine learning models capable of replicating and generalizing human skills.
arXiv Detail & Related papers (2025-04-02T06:46:29Z)
- Human-Agent Joint Learning for Efficient Robot Manipulation Skill Acquisition [48.65867987106428]
We introduce a novel system for joint learning between human operators and robots.
It enables human operators to share control of a robot end-effector with a learned assistive agent.
It reduces the need for human adaptation while ensuring the collected data is of sufficient quality for downstream tasks.
arXiv Detail & Related papers (2024-06-29T03:37:29Z)
- Adaptive Manipulation using Behavior Trees [12.061325774210392]
We present the adaptive behavior tree, which enables a robot to quickly adapt to both visual and non-visual observations during task execution.
We test our approach on a number of tasks commonly found in industrial settings.
arXiv Detail & Related papers (2024-06-20T18:01:36Z)
- Exploring of Discrete and Continuous Input Control for AI-enhanced Assistive Robotic Arms [5.371337604556312]
Collaborative robots require users to manage multiple Degrees-of-Freedom (DoFs) for tasks like grasping and manipulating objects.
This study explores three different input devices by integrating them into an established XR framework for assistive robotics.
arXiv Detail & Related papers (2024-01-13T16:57:40Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- "No, to the Right" -- Online Language Corrections for Robotic Manipulation via Shared Autonomy [70.45420918526926]
We present LILAC, a framework for incorporating and adapting to natural language corrections online during execution.
Instead of discrete turn-taking between a human and robot, LILAC splits agency between the human and robot.
We show that our corrections-aware approach obtains higher task completion rates, and is subjectively preferred by users.
arXiv Detail & Related papers (2023-01-06T15:03:27Z)
- ThriftyDAgger: Budget-Aware Novelty and Risk Gating for Interactive Imitation Learning [23.177329496817105]
ThriftyDAgger is an algorithm for querying a human supervisor given a desired budget of human interventions.
Experiments suggest that ThriftyDAgger's intervention criteria balance task performance and supervisor burden more effectively than prior algorithms (a sketch of such a budget-aware gate follows the list below).
arXiv Detail & Related papers (2021-09-17T01:21:16Z)
- Behavior Self-Organization Supports Task Inference for Continual Robot Learning [18.071689266826212]
We propose a novel approach to continual learning of robotic control tasks.
Our approach performs unsupervised learning of behavior embeddings by incrementally self-organizing demonstrated behaviors.
Unlike previous methods, our approach makes no assumptions about the task distribution and requires no task exploration to infer tasks.
arXiv Detail & Related papers (2021-07-09T16:37:27Z)
- Deep Reinforcement Learning for Haptic Shared Control in Unknown Tasks [1.0635248457021496]
Haptic shared control (HSC) is an alternative to direct teleoperation in teleoperated systems.
The application of virtual guiding forces decreases the user's control effort and improves execution time in various tasks.
The challenge lies in developing controllers to provide the optimal guiding forces for the tasks that are being performed.
This work addresses this challenge by designing a controller based on the deep deterministic policy gradient (DDPG) algorithm to provide the assistance, and a convolutional neural network (CNN) to perform the task detection.
arXiv Detail & Related papers (2021-01-15T17:27:38Z)
- Human-in-the-Loop Imitation Learning using Remote Teleoperation [72.2847988686463]
We build a data collection system tailored to 6-DoF manipulation settings.
We develop an algorithm to train the policy iteratively on new data collected by the system.
We demonstrate that agents trained on data collected by our intervention-based system and algorithm outperform agents trained on an equivalent number of samples collected by non-interventional demonstrators.
arXiv Detail & Related papers (2020-12-12T05:30:35Z)
- Thinking While Moving: Deep Reinforcement Learning with Concurrent Control [122.49572467292293]
We study reinforcement learning in settings where sampling an action from the policy must be done concurrently with the time evolution of the controlled system.
Much like a person or an animal, the robot must think and move at the same time, deciding on its next action before the previous one has completed.
arXiv Detail & Related papers (2020-04-13T17:49:29Z)
- Scalable Multi-Task Imitation Learning with Autonomous Improvement [159.9406205002599]
We build an imitation learning system that can continuously improve through autonomous data collection.
We leverage the robot's own trials as demonstrations for tasks other than the one that the robot actually attempted.
In contrast to prior imitation learning approaches, our method can autonomously collect data with sparse supervision for continuous improvement.
arXiv Detail & Related papers (2020-02-25T18:56:42Z)
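
As a point of contrast with LazyDAgger's discrepancy hysteresis, the ThriftyDAgger entry above gates interventions on novelty and risk under an explicit query budget. A minimal hypothetical sketch of such a gate follows; `novelty`, `risk`, and the threshold names are assumptions for illustration, not the paper's API.

```python
def thrifty_style_gate(s, novelty, risk, queries_used, query_budget,
                       nov_thresh, risk_thresh):
    """Request a human intervention only while budget remains and the
    state looks either unfamiliar (novel) or likely to fail (risky).
    In practice the thresholds would be calibrated so that the expected
    number of queries stays within the desired budget.
    """
    if queries_used >= query_budget:
        return False  # budget exhausted: remain autonomous
    return novelty(s) > nov_thresh or risk(s) > risk_thresh
```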