Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies
- URL: http://arxiv.org/abs/2406.13711v1
- Date: Wed, 19 Jun 2024 17:08:28 GMT
- Title: Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies
- Authors: Isaac Sheidlower, Emma Bethel, Douglas Lilly, Reuben M. Aronson, Elaine Schaertl Short
- Abstract summary: We present Imaginary Out-of-Distribution Actions, IODA, an algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks.
IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation.
- Score: 1.6078134198754157
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: It is crucial that users are empowered to take advantage of the functionality of a robot and use their understanding of that functionality to perform novel and creative tasks. Given a robot trained with Reinforcement Learning (RL), a user may wish to leverage that autonomy along with their familiarity of how they expect the robot to behave to collaborate with the robot. One technique is for the user to take control of some of the robot's action space through teleoperation, allowing the RL policy to simultaneously control the rest. We formalize this type of shared control as Partitioned Control (PC). However, this may not be possible using an out-of-the-box RL policy. For example, a user's control may bring the robot into a failure state from the policy's perspective, causing it to act unexpectedly and hindering the success of the user's desired task. In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions, IODA, an initial algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks. We deploy IODA in a user study with a real robot and find that IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation. We also show that in PC, there is a strong and significant correlation between task performance and the robot's ability to meet user expectations, highlighting the need for approaches like IODA. Code is available at https://github.com/AABL-Lab/ioda_roman_2024
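The abstract's notion of Partitioned Control, in which the user teleoperates some of the robot's action dimensions while the RL policy simultaneously controls the rest, can be illustrated with a minimal sketch. The function and variable names below (`partitioned_action`, `user_mask`) are illustrative assumptions, not taken from the paper's released code.

```python
# Minimal sketch of Partitioned Control (PC): each action dimension is
# assigned either to the user (via teleoperation) or to the RL policy.
import numpy as np

def partitioned_action(policy_action, user_action, user_mask):
    """Combine policy and user commands per action dimension.

    user_mask[i] == True  -> dimension i is commanded by the user,
    user_mask[i] == False -> dimension i is commanded by the RL policy.
    """
    policy_action = np.asarray(policy_action, dtype=float)
    user_action = np.asarray(user_action, dtype=float)
    user_mask = np.asarray(user_mask, dtype=bool)
    # Select user commands where the mask is set, policy commands elsewhere.
    return np.where(user_mask, user_action, policy_action)

# Example: a 4-DoF action where the user controls the first two dimensions.
a = partitioned_action([0.1, 0.2, 0.3, 0.4],
                       [1.0, -1.0, 0.0, 0.0],
                       [True, True, False, False])
# a == [1.0, -1.0, 0.3, 0.4]
```

As the abstract notes, naively feeding such mixed actions to an out-of-the-box policy can drive it into states it treats as out-of-distribution; IODA is the paper's proposed remedy for that failure mode.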
Related papers
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents RoboScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for code generation for robot manipulation tasks specified in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - Modifying RL Policies with Imagined Actions: How Predictable Policies
Can Enable Users to Perform Novel Tasks [0.0]
A user who has access to a Reinforcement Learning based robot may want to use the robot's autonomy and their knowledge of its behavior to complete new tasks.
One way is for the user to take control of some of the robot's action space through teleoperation while the RL policy simultaneously controls the rest.
In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions, IODA, an initial algorithm for addressing that problem and empowering users to leverage their expectations of a robot's behavior to accomplish new tasks.
arXiv Detail & Related papers (2023-12-10T20:40:45Z) - Interactive Robot Learning from Verbal Correction [42.37176329867376]
OLAF allows users to teach a robot using verbal corrections when the robot makes mistakes.
A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback.
We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks.
arXiv Detail & Related papers (2023-10-26T16:46:12Z) - ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space [9.806227900768926]
This paper introduces a novel deep-learning approach for human-to-robot motion retargeting.
Our method does not require paired human-to-robot data, which facilitates its translation to new robots.
Our model outperforms existing works on human-to-robot similarity in both efficiency and precision.
arXiv Detail & Related papers (2023-09-11T08:55:04Z) - Using Knowledge Representation and Task Planning for Robot-agnostic
Skills on the Example of Contact-Rich Wiping Tasks [44.99833362998488]
We show how a single robot skill that utilizes knowledge representation, task planning, and automatic selection of skill implementations can be executed in different contexts.
We demonstrate how the skill-based control platform enables this with contact-rich wiping tasks on different robot systems.
arXiv Detail & Related papers (2023-08-27T21:17:32Z) - Exploring AI-enhanced Shared Control for an Assistive Robotic Arm [4.999814847776098]
In particular, we explore how Artificial Intelligence (AI) can be integrated into a shared control paradigm.
We focus on the consequential requirements for the interface between human and robot.
arXiv Detail & Related papers (2023-06-23T14:19:56Z) - Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement
Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z) - Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is typically not part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z) - Show Me What You Can Do: Capability Calibration on Reachable Workspace
for Human-Robot Collaboration [83.4081612443128]
We show that a short calibration using REMP can effectively bridge the gap between what a non-expert user thinks a robot can reach and the ground truth.
We show that this calibration procedure not only results in better user perception, but also promotes more efficient human-robot collaborations.
arXiv Detail & Related papers (2021-03-06T09:14:30Z) - Joint Mind Modeling for Explanation Generation in Complex Human-Robot
Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications.
Results show that the explanations generated by our approach significantly improve collaboration performance and user perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.