Training Robots to Evaluate Robots: Example-Based Interactive Reward
Functions for Policy Learning
- URL: http://arxiv.org/abs/2212.08961v1
- Date: Sat, 17 Dec 2022 21:44:03 GMT
- Title: Training Robots to Evaluate Robots: Example-Based Interactive Reward
Functions for Policy Learning
- Authors: Kun Huang, Edward S. Hu, Dinesh Jayaraman
- Abstract summary: We propose to train robots to acquire such interactive behaviors automatically.
These evaluations in turn serve as "interactive reward functions" (IRFs)
IRFs can be conveniently trained using only examples of successful outcomes.
- Score: 20.565163553170397
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Physical interactions can often help reveal information that is not readily
apparent. For example, we may tug at a table leg to evaluate whether it is
built well, or turn a water bottle upside down to check that it is watertight.
We propose to train robots to acquire such interactive behaviors automatically,
for the purpose of evaluating the result of an attempted robotic skill
execution. These evaluations in turn serve as "interactive reward functions"
(IRFs) for training reinforcement learning policies to perform the target
skill, such as screwing the table leg tightly. In addition, even after task
policies are fully trained, IRFs can serve as verification mechanisms that
improve online task execution. For any given task, our IRFs can be conveniently
trained using only examples of successful outcomes, and no further
specification is needed to train the task policy thereafter. In our evaluations
on door locking and weighted block stacking in simulation, and screw tightening
on a real robot, IRFs enable large performance improvements, even outperforming
baselines with access to demonstrations or carefully engineered rewards.
Project website: https://sites.google.com/view/lirf-corl-2022/
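The abstract describes reward functions that can be trained from success examples alone. As a loose, hypothetical illustration of the example-based reward idea (NOT the authors' implementation, which additionally learns interactive probing behaviors), the sketch below trains a logistic-regression success classifier on toy outcome features and uses its success probability as a reward signal; all names and numbers here are invented for illustration.

```python
# Hypothetical sketch: an example-based reward classifier.
# Successful outcomes are labeled 1; outcomes from early policy
# rollouts are treated as negatives. The classifier's success
# probability then serves as a reward for policy learning.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D outcome feature, e.g. wobble amplitude measured while
# "tugging" a table leg: tight screws wobble little.
success_examples = rng.normal(0.1, 0.05, size=(50, 1))  # label 1
rollout_outcomes = rng.normal(0.8, 0.20, size=(50, 1))  # label 0

X = np.vstack([success_examples, rollout_outcomes])
y = np.concatenate([np.ones(50), np.zeros(50)])

# Fit logistic regression by plain gradient descent.
w, b = np.zeros(1), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def interactive_reward(outcome_feature: float) -> float:
    """Classifier's success probability, used as a reward."""
    return float(1.0 / (1.0 + np.exp(-(outcome_feature * w[0] + b))))

# A low-wobble outcome should earn a higher reward than a wobbly one.
print(interactive_reward(0.1), interactive_reward(0.8))
```

In the paper's setting the classifier input would come from observations gathered while the robot actively probes the object, rather than from a fixed feature as here.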
Related papers
- Real-World Reinforcement Learning of Active Perception Behaviors [27.56548234738969]
A robot's instantaneous sensory observations do not always reveal task-relevant state information. We propose a simple real-world robot learning recipe to efficiently train active perception policies. Our approach, asymmetric advantage weighted regression, exploits access to "privileged" extra sensors at training time.
arXiv Detail & Related papers (2025-12-01T02:05:20Z)
- Solving Robotics Tasks with Prior Demonstration via Exploration-Efficient Deep Reinforcement Learning [0.688204255655161]
This paper proposes an exploration-efficient Deep Reinforcement Learning with Reference policy (DRLR) framework for learning robotics tasks that incorporates demonstrations. The DRLR framework is developed based on an algorithm called Imitation Bootstrapped Reinforcement Learning (IBRL).
arXiv Detail & Related papers (2025-09-04T10:02:32Z)
- FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning [74.25049012472502]
FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
arXiv Detail & Related papers (2024-09-25T03:15:17Z)
- Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient Guidance [0.3613661942047476]
We develop a novel diffusion modeling approach that is offline, constraint-guided, and expressive of diverse agile behaviors. We demonstrate the effectiveness of our approach for time-critical robotic tasks by evaluating KCGG in two challenging domains: simulated air hockey and real table tennis.
arXiv Detail & Related papers (2024-09-23T20:26:51Z)
- Affordance-Guided Reinforcement Learning via Visual Prompting [51.361977466993345]
Keypoint-based Affordance Guidance for Improvements (KAGI) is a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL.
On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 20K online fine-tuning steps.
arXiv Detail & Related papers (2024-07-14T21:41:29Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Interactive Robot Learning from Verbal Correction [42.37176329867376]
OLAF allows users to teach a robot using verbal corrections when the robot makes mistakes.
A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback.
We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks.
arXiv Detail & Related papers (2023-10-26T16:46:12Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Variational Meta Reinforcement Learning for Social Robotics [15.754961709819938]
Social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, since social norms depend strongly on the environment.
In this work, we investigate meta-reinforcement learning (meta-RL) as a potential solution.
arXiv Detail & Related papers (2022-06-07T12:08:59Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks [0.0]
Reinforcement learning has been successfully applied to solving the reaching task with robotic arms.
It is shown that augmenting the reward signal with the Hindsight Experience Replay exploration technique increases the average return of off-policy agents.
arXiv Detail & Related papers (2020-11-11T14:00:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.