Online Behavior Modification for Expressive User Control of RL-Trained Robots
- URL: http://arxiv.org/abs/2408.16776v1
- Date: Thu, 15 Aug 2024 12:28:08 GMT
- Title: Online Behavior Modification for Expressive User Control of RL-Trained Robots
- Authors: Isaac Sheidlower, Mavis Murdock, Emma Bethel, Reuben M. Aronson, Elaine Schaertl Short
- Abstract summary: Online behavior modification is a paradigm in which users have control over behavior features of a robot in real time as it autonomously completes a task using an RL-trained policy.
We present a behavior diversity based algorithm, Adjustable Control Of RL Dynamics (ACORD), and demonstrate its applicability to online behavior modification in simulation and a user study.
- Score: 1.6078134198754157
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: Reinforcement Learning (RL) is an effective method for robots to learn tasks. However, in typical RL, end-users have little to no control over how the robot does the task after the robot has been deployed. To address this, we introduce the idea of online behavior modification, a paradigm in which users have control over behavior features of a robot in real time as it autonomously completes a task using an RL-trained policy. To show the value of this user-centered formulation for human-robot interaction, we present a behavior diversity based algorithm, Adjustable Control Of RL Dynamics (ACORD), and demonstrate its applicability to online behavior modification in simulation and a user study. In the study (n=23) users adjust the style of paintings as a robot traces a shape autonomously. We compare ACORD to RL and Shared Autonomy (SA), and show ACORD affords user-preferred levels of control and expression, comparable to SA, but with the potential for autonomous execution and robustness of RL.
Related papers
- Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies [1.6078134198754157]
We present Imaginary Out-of-Distribution Actions, IODA, an algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks.
IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectation.
arXiv Detail & Related papers (2024-06-19T17:08:28Z) - Reinforcement Learning for Versatile, Dynamic, and Robust Bipedal Locomotion Control [106.32794844077534]
This paper presents a study on using deep reinforcement learning to create dynamic locomotion controllers for bipedal robots.
We develop a general control solution that can be used for a range of dynamic bipedal skills, from periodic walking and running to aperiodic jumping and standing.
This work pushes the limits of agility for bipedal robots through extensive real-world experiments.
arXiv Detail & Related papers (2024-01-30T10:48:43Z) - Modifying RL Policies with Imagined Actions: How Predictable Policies
Can Enable Users to Perform Novel Tasks [0.0]
A user who has access to a Reinforcement Learning based robot may want to use the robot's autonomy and their knowledge of its behavior to complete new tasks.
One way is for the user to take control of some of the robot's action space through teleoperation while the RL policy simultaneously controls the rest.
In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions, IODA, an initial algorithm for addressing that problem and empowering users to leverage their expectation of a robot's behavior to accomplish new tasks.
arXiv Detail & Related papers (2023-12-10T20:40:45Z) - Grow Your Limits: Continuous Improvement with Real-World RL for Robotic
Locomotion [66.69666636971922]
We present APRL, a policy regularization framework that modulates the robot's exploration over the course of training.
APRL enables a quadrupedal robot to efficiently learn to walk entirely in the real world within minutes.
arXiv Detail & Related papers (2023-10-26T17:51:46Z) - Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for
Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z) - Combining model-predictive control and predictive reinforcement learning
for stable quadrupedal robot locomotion [0.0]
We study how this can be achieved by a combination of model-predictive and predictive reinforcement learning controllers.
In this work, we combine both control methods to address the quadrupedal robot stable gait generation problem.
arXiv Detail & Related papers (2023-07-15T09:22:37Z) - Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from
Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
arXiv Detail & Related papers (2023-06-06T01:36:56Z) - Human-AI Shared Control via Frequency-based Policy Dissection [34.0399894373716]
Human-AI shared control allows humans to interact and collaborate with AI to accomplish control tasks in complex environments.
Previous Reinforcement Learning (RL) methods attempt goal-conditioned designs to achieve human-controllable policies.
We develop a simple yet effective frequency-based approach called Policy Dissection to align the intermediate representation of the learned neural controller with the kinematic attributes of the agent behavior.
arXiv Detail & Related papers (2022-05-31T23:57:55Z) - Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z) - Accelerating Robotic Reinforcement Learning via Parameterized Action
Primitives [92.0321404272942]
Reinforcement learning can be used to build general-purpose robotic systems.
However, training RL agents to solve robotics tasks remains challenging.
In this work, we manually specify a library of robot action primitives (RAPS), parameterized with arguments that are learned by an RL policy.
We find that our simple change to the action interface substantially improves both the learning efficiency and task performance.
arXiv Detail & Related papers (2021-10-28T17:59:30Z) - Learning Force Control for Contact-rich Manipulation Tasks with Rigid
Position-controlled Robots [9.815369993136512]
We propose a learning-based force control framework combining RL techniques with traditional force control.
Within said control scheme, we implemented two different conventional approaches to achieve force control with position-controlled robots.
Finally, we developed a fail-safe mechanism for safely training an RL agent on manipulation tasks using a real rigid robot manipulator.
arXiv Detail & Related papers (2020-03-02T01:58:03Z)