Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies
- URL: http://arxiv.org/abs/2406.13711v1
- Date: Wed, 19 Jun 2024 17:08:28 GMT
- Title: Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies
- Authors: Isaac Sheidlower, Emma Bethel, Douglas Lilly, Reuben M. Aronson, Elaine Schaertl Short,
- Abstract summary: We present Imaginary Out-of-Distribution Actions, IODA, an algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks.
IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation.
- Score: 1.6078134198754157
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: It is crucial that users are empowered to take advantage of the functionality of a robot and use their understanding of that functionality to perform novel and creative tasks. Given a robot trained with Reinforcement Learning (RL), a user may wish to leverage that autonomy along with their familiarity of how they expect the robot to behave to collaborate with the robot. One technique is for the user to take control of some of the robot's action space through teleoperation, allowing the RL policy to simultaneously control the rest. We formalize this type of shared control as Partitioned Control (PC). However, this may not be possible using an out-of-the-box RL policy. For example, a user's control may bring the robot into a failure state from the policy's perspective, causing it to act unexpectedly and hindering the success of the user's desired task. In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions, IODA, an initial algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks. We deploy IODA in a user study with a real robot and find that IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation. We also show that in PC, there is a strong and significant correlation between task performance and the robot's ability to meet user expectations, highlighting the need for approaches like IODA. Code is available at https://github.com/AABL-Lab/ioda_roman_2024
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate our model in terms of its ability to perform tasks in zero shot after pre-training, follow language instructions from people, and its ability to acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z) - Know your limits! Optimize the robot's behavior through self-awareness [11.021217430606042]
Recent human-robot imitation algorithms focus on following a reference human motion with high precision.
We introduce a deep-learning model that anticipates the robot's performance when imitating a given reference.
Our Self-AWare model (SAW) ranks potential robot behaviors based on various criteria, such as fall likelihood, adherence to the reference motion, and smoothness.
arXiv Detail & Related papers (2024-09-16T14:14:58Z) - Online Behavior Modification for Expressive User Control of RL-Trained Robots [1.6078134198754157]
Online behavior modification is a paradigm in which users have control over behavior features of a robot in real time as it autonomously completes a task using an RL-trained policy.
We present a behavior diversity based algorithm, Adjustable Control Of RL Dynamics (ACORD), and demonstrate its applicability to online behavior modification in simulation and a user study.
arXiv Detail & Related papers (2024-08-15T12:28:08Z) - Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
Key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z) - RoboScript: Code Generation for Free-Form Manipulation Tasks across Real
and Simulation [77.41969287400977]
This paper presents textbfRobotScript, a platform for a deployable robot manipulation pipeline powered by code generation.
We also present a benchmark for a code generation benchmark for robot manipulation tasks in free-form natural language.
We demonstrate the adaptability of our code generation framework across multiple robot embodiments, including the Franka and UR5 robot arms.
arXiv Detail & Related papers (2024-02-22T15:12:00Z) - Modifying RL Policies with Imagined Actions: How Predictable Policies
Can Enable Users to Perform Novel Tasks [0.0]
A user who has access to a Reinforcement Learning based robot may want to use the robot's autonomy and their knowledge of its behavior to complete new tasks.
One way is for the user to take control of some of the robot's action space through teleoperation while the RL policy simultaneously controls the rest.
In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions, IODA, an initial algorithm for addressing that problem and empowering user's to leverage their expectation of a robot's behavior to accomplish new tasks.
arXiv Detail & Related papers (2023-12-10T20:40:45Z) - ImitationNet: Unsupervised Human-to-Robot Motion Retargeting via Shared Latent Space [9.806227900768926]
This paper introduces a novel deep-learning approach for human-to-robot motion.
Our method does not require paired human-to-robot data, which facilitates its translation to new robots.
Our model outperforms existing works regarding human-to-robot similarity in terms of efficiency and precision.
arXiv Detail & Related papers (2023-09-11T08:55:04Z) - Exploring AI-enhanced Shared Control for an Assistive Robotic Arm [4.999814847776098]
In particular, we explore how Artifical Intelligence (AI) can be integrated into a shared control paradigm.
In particular, we focus on the consequential requirements for the interface between human and robot.
arXiv Detail & Related papers (2023-06-23T14:19:56Z) - Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement
Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z) - Model Predictive Control for Fluid Human-to-Robot Handovers [50.72520769938633]
Planning motions that take human comfort into account is not a part of the human-robot handover process.
We propose to generate smooth motions via an efficient model-predictive control framework.
We conduct human-to-robot handover experiments on a diverse set of objects with several users.
arXiv Detail & Related papers (2022-03-31T23:08:20Z) - Joint Mind Modeling for Explanation Generation in Complex Human-Robot
Collaborative Tasks [83.37025218216888]
We propose a novel explainable AI (XAI) framework for achieving human-like communication in human-robot collaborations.
The robot builds a hierarchical mind model of the human user and generates explanations of its own mind as a form of communications.
Results show that the generated explanations of our approach significantly improves the collaboration performance and user perception of the robot.
arXiv Detail & Related papers (2020-07-24T23:35:03Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.