Interactive Robot Learning from Verbal Correction
- URL: http://arxiv.org/abs/2310.17555v1
- Date: Thu, 26 Oct 2023 16:46:12 GMT
- Title: Interactive Robot Learning from Verbal Correction
- Authors: Huihan Liu, Alice Chen, Yuke Zhu, Adith Swaminathan, Andrey Kolobov,
Ching-An Cheng
- Abstract summary: OLAF allows users to teach a robot using verbal corrections when the robot makes mistakes.
A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback.
We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks.
- Score: 42.37176329867376
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The ability to learn and refine behavior after deployment has become ever
more important for robots as we design them to operate in unstructured
environments like households. In this work, we design a new learning system
based on a large language model (LLM), OLAF, that allows everyday users to teach
a robot using verbal corrections when the robot makes mistakes, e.g., by saying
"Stop what you're doing. You should move closer to the cup." A key feature of
OLAF is its ability to update the robot's visuomotor neural policy based on the
verbal feedback to avoid repeating mistakes in the future. This is in contrast
to existing LLM-based robotic systems, which only follow verbal commands or
corrections but do not learn from them. We demonstrate the efficacy of our design
in experiments where a user teaches a robot to perform long-horizon
manipulation tasks both in simulation and on physical hardware, achieving on
average 20.0% improvement in policy success rate. Videos and more results are
at https://ut-austin-rpl.github.io/olaf/
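From the abstract, the learning loop has four steps: the robot acts under its current visuomotor policy, the user interrupts with a verbal correction, an LLM converts the utterance into corrected training data, and the policy is updated so the mistake is not repeated. The page does not spell out the implementation, so the Python sketch below is a minimal, hypothetical rendering of that loop; `Policy`, `Transition`, and `llm_relabel` are illustrative assumptions, not OLAF's actual interfaces.

```python
# Minimal, hypothetical sketch of the correct-then-update loop described
# in the abstract. Names (Policy, Transition, llm_relabel) are
# illustrative assumptions, not OLAF's actual interfaces.
from dataclasses import dataclass, field
from typing import List

@dataclass
class Transition:
    observation: List[float]  # e.g. flattened camera/proprioception features
    action: List[float]       # the action the policy produced

@dataclass
class Policy:
    """Stand-in for the robot's visuomotor neural policy."""
    dataset: List[Transition] = field(default_factory=list)

    def act(self, observation: List[float]) -> List[float]:
        # Placeholder: a real policy would run a neural network here.
        return [0.0] * 7

    def finetune(self, corrected: List[Transition]) -> None:
        # Placeholder: a real system would take gradient steps on the
        # relabeled data so the same mistake is not repeated.
        self.dataset.extend(corrected)

def llm_relabel(utterance: str, rollout: List[Transition]) -> List[Transition]:
    """Assumed helper: ask an LLM to turn a verbal correction such as
    'move closer to the cup' into corrected action targets for the
    offending part of the rollout."""
    # Toy stand-in: nudge every action; a real implementation would query
    # an LLM to localize the mistake and synthesize new action labels.
    return [Transition(t.observation, [a + 0.05 for a in t.action])
            for t in rollout]

def teaching_episode(policy: Policy, observations, user_correction: str):
    rollout = [Transition(obs, policy.act(obs)) for obs in observations]
    if user_correction:  # the user interrupted with a verbal correction
        policy.finetune(llm_relabel(user_correction, rollout))
    return rollout

policy = Policy()
obs_stream = [[0.0] * 10 for _ in range(5)]  # dummy observations
teaching_episode(policy, obs_stream,
                 "Stop what you're doing. You should move closer to the cup.")
print(f"policy now stores {len(policy.dataset)} corrected transitions")
```

The design point the abstract emphasizes is the `finetune` step: unlike systems that only comply with a correction in the moment, the corrected data persists in the policy's training set.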
Related papers
- $π_0$: A Vision-Language-Action Flow Model for General Robot Control [77.32743739202543]
We propose a novel flow matching architecture built on top of a pre-trained vision-language model (VLM) to inherit Internet-scale semantic knowledge.
We evaluate the model on its ability to perform tasks zero-shot after pre-training, follow language instructions from people, and acquire new skills via fine-tuning.
arXiv Detail & Related papers (2024-10-31T17:22:30Z)
- Human-Robot Mutual Learning through Affective-Linguistic Interaction and Differential Outcomes Training [Pre-Print] [0.3811184252495269]
We test how affective-linguistic communication, in combination with differential outcomes training, affects mutual learning in a human-robot context.
Taking inspiration from child-caregiver dynamics, our human-robot interaction setup consists of a (simulated) robot attempting to learn how best to communicate internal, homeostatically-controlled needs.
arXiv Detail & Related papers (2024-07-01T13:35:08Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our key insight is to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Incremental Learning of Humanoid Robot Behavior from Natural Interaction and Large Language Models [23.945922720555146]
We propose a system to achieve incremental learning of complex behavior from natural interaction.
We integrate the system into the robot cognitive architecture of the humanoid robot ARMAR-6.
arXiv Detail & Related papers (2023-09-08T13:29:05Z)
- Quality-Diversity Optimisation on a Physical Robot Through Dynamics-Aware and Reset-Free Learning [4.260312058817663]
We build upon the Reset-Free QD (RF-QD) algorithm to learn controllers directly on a physical robot.
This method uses a dynamics model, learned from interactions between the robot and the environment, to predict the robot's behaviour.
RF-QD also includes a recovery policy that returns the robot to a safe zone when it has walked outside of it, allowing continuous learning (see the first sketch after this list).
arXiv Detail & Related papers (2023-04-24T13:24:00Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices by learning to both do and undo the task, while inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Fleet-DAgger: Interactive Robot Fleet Learning with Scalable Human Supervision [72.4735163268491]
Commercial and industrial deployments of robot fleets often fall back on remote human teleoperators during execution.
We formalize the Interactive Fleet Learning (IFL) setting, in which multiple robots interactively query and learn from multiple human supervisors.
We propose Fleet-DAgger, a family of IFL algorithms, and compare a novel Fleet-DAgger algorithm to 4 baselines in simulation (see the second sketch after this list).
arXiv Detail & Related papers (2022-06-29T01:23:57Z)
- Revisiting the Adversarial Robustness-Accuracy Tradeoff in Robot Learning [121.9708998627352]
Recent work has shown that, in practical robot learning applications, adversarial training does not offer a fair robustness-accuracy trade-off.
This work revisits the robustness-accuracy trade-off in robot learning by analyzing if recent advances in robust training methods and theory can make adversarial training suitable for real-world robot applications.
arXiv Detail & Related papers (2022-04-15T08:12:15Z)
- Back to Reality for Imitation Learning [8.57914821832517]
Imitation learning, and robot learning in general, emerged due to breakthroughs in machine learning, rather than breakthroughs in robotics.
We believe that a better metric for real-world robot learning is time efficiency, which better models the true cost to humans.
arXiv Detail & Related papers (2021-11-25T02:03:52Z)
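The RF-QD entry above describes a concrete mechanism: controllers are evaluated directly on hardware, the interaction data feeds a learned dynamics model that predicts behaviour before execution, and a recovery policy walks the robot back into a safe zone so no manual reset is needed. The Python sketch below is a minimal, hypothetical rendering of that reset-free loop; the dynamics-model-based controller selection is reduced to a comment, and every name (`recovery_policy`, `execute_on_robot`, the archive layout) is an assumption, not RF-QD's implementation.

```python
# Hypothetical reset-free quality-diversity loop in the spirit of the
# RF-QD summary above. Every name here is an illustrative assumption.
import random

SAFE_RADIUS = 1.0  # assumed extent of the safe zone around the start point

def in_safe_zone(position):
    return sum(p * p for p in position) ** 0.5 <= SAFE_RADIUS

def recovery_policy(position):
    """Assumed recovery behaviour: walk back toward the safe zone's centre."""
    norm = sum(p * p for p in position) ** 0.5
    return [0.5 * SAFE_RADIUS * p / norm for p in position]

def execute_on_robot(controller, position):
    """Toy stand-in for running a controller on the physical robot;
    returns the new position and a fitness score."""
    new_position = [p + random.uniform(-0.5, 0.5) for p in position]
    return new_position, random.random()

position = [0.0, 0.0]
archive = {}        # QD archive: behaviour descriptor -> best fitness
dynamics_data = []  # (controller, displacement) pairs to fit a model on
for step in range(200):
    controller = random.random()  # stand-in for QD selection/mutation;
                                  # RF-QD would consult the learned dynamics
                                  # model here before executing a controller
    new_position, fitness = execute_on_robot(controller, position)
    dynamics_data.append((controller,
                          [b - a for a, b in zip(position, new_position)]))
    position = new_position
    descriptor = tuple(round(p, 1) for p in position)
    if fitness > archive.get(descriptor, float("-inf")):
        archive[descriptor] = fitness  # keep the best fitness per cell
    if not in_safe_zone(position):
        position = recovery_policy(position)  # continue, no manual reset
print(f"archive covers {len(archive)} behaviour cells "
      f"after {len(dynamics_data)} interactions")
```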
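The Fleet-DAgger entry above frames IFL as an allocation problem: at each timestep, which robots in the fleet should receive the attention of a much smaller pool of human supervisors? The sketch below is a hypothetical illustration of one such priority-based allocation step; the heuristic and all names are assumptions, not Fleet-DAgger's actual priority function.

```python
# Hypothetical allocation step for Interactive Fleet Learning: assign a
# small pool of human supervisors to the robots that most need help.
# The priority heuristic and all names are illustrative assumptions.
import random

NUM_ROBOTS, NUM_HUMANS = 10, 2

def priority(uncertainty: float, stuck_time: int) -> float:
    # Assumed heuristic: prefer robots that are uncertain or stuck.
    return uncertainty + 0.1 * stuck_time

robots = [
    {"id": i,
     "uncertainty": random.random(),      # e.g. policy ensemble disagreement
     "stuck_time": random.randint(0, 5)}  # timesteps without progress
    for i in range(NUM_ROBOTS)
]

# Rank robots by priority and give the top-k human teleoperation; the
# rest keep executing their current policies autonomously.
ranked = sorted(robots,
                key=lambda r: priority(r["uncertainty"], r["stuck_time"]),
                reverse=True)
supervised = {r["id"] for r in ranked[:NUM_HUMANS]}
for r in robots:
    mode = "human teleoperation" if r["id"] in supervised else "autonomous"
    print(f"robot {r['id']}: {mode}")
```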