Modifying RL Policies with Imagined Actions: How Predictable Policies
Can Enable Users to Perform Novel Tasks
- URL: http://arxiv.org/abs/2312.05991v1
- Date: Sun, 10 Dec 2023 20:40:45 GMT
- Title: Modifying RL Policies with Imagined Actions: How Predictable Policies
Can Enable Users to Perform Novel Tasks
- Authors: Isaac Sheidlower, Reuben Aronson, Elaine Short
- Abstract summary: A user who has access to a Reinforcement Learning based robot may want to use the robot's autonomy and their knowledge of its behavior to complete new tasks.
One way is for the user to take control of some of the robot's action space through teleoperation while the RL policy simultaneously controls the rest.
In this work, we formalize this problem and present Imaginary Out-of-Distribution Actions (IODA), an initial algorithm for addressing it and empowering users to leverage their expectations of a robot's behavior to accomplish new tasks.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: It is crucial that users are empowered to use the functionalities of a robot
to creatively solve problems on the fly. A user who has access to a
Reinforcement Learning (RL) based robot may want to use the robot's autonomy
and their knowledge of its behavior to complete new tasks. One way is for the
user to take control of some of the robot's action space through teleoperation
while the RL policy simultaneously controls the rest. However, an
out-of-the-box RL policy may not readily facilitate this. For example, a user's
control may bring the robot into a failure state from the policy's perspective,
causing it to act in a way the user is not familiar with, hindering the success
of the user's desired task. In this work, we formalize this problem and present
Imaginary Out-of-Distribution Actions (IODA), an initial algorithm for
addressing it and empowering users to leverage their expectations of
a robot's behavior to accomplish new tasks.
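To make the setup concrete, here is a minimal sketch of the shared-control loop the abstract describes, assuming a continuous action space split between user and policy. The nearest-neighbor projection used to "imagine" a familiar state is an assumption for illustration only; it is not the paper's actual IODA algorithm, and all names below are hypothetical.

```python
import numpy as np

# Hypothetical sketch, not the paper's algorithm: the user teleoperates a
# subset of action dimensions while a frozen RL policy fills in the rest.
# To keep the policy's behavior predictable when the user drives the robot
# off its training distribution, the policy is queried on an "imagined"
# in-distribution state (here: the nearest training state) instead of the
# true one.

USER_DIMS = np.array([0, 1])    # dimensions the user controls (assumed)
POLICY_DIMS = np.array([2, 3])  # dimensions left to the policy (assumed)

def imagine_in_distribution(state, train_states):
    """Project the current state onto the training set via nearest
    neighbor -- one simple stand-in for 'imagining' a familiar state."""
    idx = np.argmin(np.linalg.norm(train_states - state, axis=1))
    return train_states[idx]

def shared_control_step(state, user_action, policy, train_states):
    imagined = imagine_in_distribution(state, train_states)
    policy_action = policy(imagined)   # policy acts as if nothing unusual happened
    action = np.zeros(4)               # 4-D action space assumed above
    action[USER_DIMS] = user_action    # user overrides their dimensions
    action[POLICY_DIMS] = policy_action[POLICY_DIMS]
    return action
```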
Related papers
- Online Behavior Modification for Expressive User Control of RL-Trained Robots [1.6078134198754157]
Online behavior modification is a paradigm in which users have control over behavior features of a robot in real time as it autonomously completes a task using an RL-trained policy.
We present a behavior diversity based algorithm, Adjustable Control Of RL Dynamics (ACORD), and demonstrate its applicability to online behavior modification in simulation and a user study.
arXiv Detail & Related papers (2024-08-15T12:28:08Z)
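The "control over behavior features" in the ACORD entry above could plausibly be realized by conditioning the policy on a user-adjustable feature vector. The sketch below illustrates that reading only; it is not ACORD's behavior-diversity training, and the names are hypothetical.

```python
import numpy as np

# Illustration of the interface, not ACORD itself: the policy network takes a
# user-set behavior vector z (e.g., end-effector height or speed) alongside
# the state, so the user can steer style in real time while the policy keeps
# completing the task autonomously.

class BehaviorConditionedPolicy:
    def __init__(self, net):
        self.net = net                                  # maps [state; z] -> action

    def act(self, state, z):
        return self.net(np.concatenate([state, z]))

# At every control step the user may move a slider that changes z:
#   action = policy.act(state, z=np.array([slider_value]))
```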
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
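As an illustration of the training target the ECoT entry above implies, the following hypothetical schema asks the model for intermediate reasoning fields before the action; ECoT's actual reasoning steps and tokenization are defined in the paper, and the field names here are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical schema illustrating reasoning-before-action: the model must
# emit a plan, the current sub-task, a motion description, and visually
# grounded features before the action itself.

@dataclass
class EmbodiedChainOfThought:
    plan: str                         # high-level plan for the instruction
    subtask: str                      # sub-task being executed now
    motion: str                       # low-level motion, e.g. "move gripper left"
    grounded_features: List[str] = field(default_factory=list)  # e.g. object boxes
    action: List[float] = field(default_factory=list)           # robot action, last

# Training pairs map (observation, instruction) -> EmbodiedChainOfThought,
# so reasoning tokens are generated before action tokens at inference time.
```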
- Imagining In-distribution States: How Predictable Robot Behavior Can Enable User Control Over Learned Policies [1.6078134198754157]
We present Imaginary Out-of-Distribution Actions, IODA, an algorithm which empowers users to leverage their expectations of a robot's behavior to accomplish new tasks.
IODA leads to both better task performance and a higher degree of alignment between robot behavior and user expectations.
arXiv Detail & Related papers (2024-06-19T17:08:28Z)
- Interactive Robot Learning from Verbal Correction [42.37176329867376]
OLAF allows users to teach a robot using verbal corrections when the robot makes mistakes.
A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback.
We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks.
arXiv Detail & Related papers (2023-10-26T16:46:12Z)
- Stabilizing Contrastive RL: Techniques for Robotic Goal Reaching from Offline Data [101.43350024175157]
Self-supervised learning has the potential to decrease the amount of human annotation and engineering effort required to learn control strategies.
Our work builds on prior work showing that reinforcement learning (RL) itself can be cast as a self-supervised problem.
We demonstrate that a self-supervised RL algorithm based on contrastive learning can solve real-world, image-based robotic manipulation tasks.
arXiv Detail & Related papers (2023-06-06T01:36:56Z)
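A minimal sketch of the contrastive view of RL mentioned in the entry above, assuming the common InfoNCE recipe in which (state, action) embeddings are scored against reached-goal embeddings with in-batch negatives; the paper's specific stabilization techniques are not shown.

```python
import numpy as np

# Sketch of an InfoNCE-style contrastive critic: row i of sa_embed (embedding
# of a state-action pair) should score highest against row i of goal_embed
# (embedding of a state actually reached later), with the other rows in the
# batch serving as negatives. This is the generic recipe, not the exact loss.

def info_nce_loss(sa_embed: np.ndarray, goal_embed: np.ndarray) -> float:
    logits = sa_embed @ goal_embed.T                  # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)       # for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_probs)))        # cross-entropy, positives on diagonal
```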
- Dexterous Manipulation from Images: Autonomous Real-World RL via Substep Guidance [71.36749876465618]
We describe a system for vision-based dexterous manipulation that provides a "programming-free" approach for users to define new tasks.
Our system includes a framework for users to define a final task and intermediate sub-tasks with image examples.
We present experimental results with a four-finger robotic hand learning multi-stage object manipulation tasks directly in the real world.
arXiv Detail & Related papers (2022-12-19T22:50:40Z)
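One plausible shape for the "programming-free" task definition in the entry above is a specification of sub-tasks, each with user-provided goal images, which could then train per-substep success classifiers used as rewards. The structure below is a guess at such an interface, not the paper's actual API.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical task specification: the user supplies example images for the
# final goal and for each intermediate sub-task. All names are assumptions.

@dataclass
class SubstepSpec:
    name: str
    example_images: List[str]      # paths to user-provided goal images

@dataclass
class TaskSpec:
    final: SubstepSpec             # the final task
    substeps: List[SubstepSpec]    # intermediate sub-tasks, in order

task = TaskSpec(
    final=SubstepSpec("object_in_bin", ["imgs/done_0.png", "imgs/done_1.png"]),
    substeps=[SubstepSpec("grasped", ["imgs/grasp_0.png"]),
              SubstepSpec("lifted", ["imgs/lift_0.png"])],
)
```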
- Verifying Learning-Based Robotic Navigation Systems [61.01217374879221]
We show how modern verification engines can be used for effective model selection.
Specifically, we use verification to detect and rule out policies that may demonstrate suboptimal behavior.
Our work is the first to demonstrate the use of verification backends for recognizing suboptimal DRL policies in real-world robots.
arXiv Detail & Related papers (2022-05-26T17:56:43Z)
- ASHA: Assistive Teleoperation via Human-in-the-Loop Reinforcement Learning [91.58711082348293]
Reinforcement learning from online user feedback on the system's performance offers a natural way to adapt assistive teleoperation interfaces, but it tends to require a large amount of human-in-the-loop training data, especially when feedback is sparse.
We propose a hierarchical solution that learns efficiently from sparse user feedback.
arXiv Detail & Related papers (2022-02-05T02:01:19Z)
- Lifelong Robotic Reinforcement Learning by Retaining Experiences [61.79346922421323]
Many multi-task reinforcement learning efforts assume the robot can collect data from all tasks at all times.
In this work, we study a sequential multi-task RL problem motivated by the practical constraints of physical robotic systems.
We derive an approach that effectively leverages the data and policies learned for previous tasks to cumulatively grow the robot's skill-set.
arXiv Detail & Related papers (2021-09-19T18:00:51Z)
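The "retaining experiences" idea in the entry above can be pictured as keeping every prior task's data and warm-starting each new task from it; the loop below is an illustrative reading with hypothetical function names, not the paper's method.

```python
# Illustrative reading of experience retention: data from all previous tasks
# is kept and replayed when learning each new task, and the new policy is
# initialized from the previous one.

def lifelong_training(tasks, train, init_policy):
    retained = []                          # experience from all tasks so far
    policy = init_policy()
    for task in tasks:                     # tasks arrive sequentially
        new_data = task.collect(policy)    # interact with the current task
        retained.extend(new_data)
        policy = train(policy, retained)   # learn on old + new experience
    return policy
```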
- SQUIRL: Robust and Efficient Learning from Video Demonstration of Long-Horizon Robotic Manipulation Tasks [8.756012472587601]
Deep reinforcement learning (RL) can be used to learn complex manipulation tasks.
However, RL requires the robot to collect a large amount of real-world experience.
SQUIRL performs a new but related long-horizon task robustly given only a single video demonstration.
arXiv Detail & Related papers (2020-03-10T20:26:26Z)
- Learning Force Control for Contact-rich Manipulation Tasks with Rigid Position-controlled Robots [9.815369993136512]
We propose a learning-based force control framework combining RL techniques with traditional force control.
Within this control scheme, we implement two conventional approaches to achieve force control with position-controlled robots.
Finally, we developed a fail-safe mechanism for safely training an RL agent on manipulation tasks using a real rigid robot manipulator.
arXiv Detail & Related papers (2020-03-02T01:58:03Z)
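As one concrete example of the conventional approaches the entry above combines with RL, here is a minimal admittance-control step, which converts a measured force error into a position offset that a stiff position-controlled arm can track. The gains and discrete update are illustrative; the paper's exact formulation and RL component are not reproduced.

```python
import numpy as np

# Minimal admittance-control step (illustrative gains): one Euler step of
#   m*x_dd + d*x_d + k*x = f_meas - f_des
# returning the position offset to superimpose on the robot's commanded
# trajectory. An RL agent could tune (m, d, k) online, which is one reading
# of combining RL with traditional force control.

def admittance_step(x, dx, f_meas, f_des, m=1.0, d=80.0, k=400.0, dt=0.002):
    ddx = ((f_meas - f_des) - d * dx - k * x) / m
    dx = dx + dt * ddx
    x = x + dt * dx
    return x, dx          # new offset and velocity, fed back next cycle

# e.g. x, dx = admittance_step(np.zeros(3), np.zeros(3), f_sensor, f_target)
```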