Training Robots to Evaluate Robots: Example-Based Interactive Reward
Functions for Policy Learning
- URL: http://arxiv.org/abs/2212.08961v1
- Date: Sat, 17 Dec 2022 21:44:03 GMT
- Title: Training Robots to Evaluate Robots: Example-Based Interactive Reward
Functions for Policy Learning
- Authors: Kun Huang, Edward S. Hu, Dinesh Jayaraman
- Abstract summary: We propose to train robots to acquire such interactive behaviors automatically.
These evaluations in turn serve as "interactive reward functions" (IRFs)
IRFs can be conveniently trained using only examples of successful outcomes.
- Score: 20.565163553170397
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Physical interactions can often help reveal information that is not readily
apparent. For example, we may tug at a table leg to evaluate whether it is
built well, or turn a water bottle upside down to check that it is watertight.
We propose to train robots to acquire such interactive behaviors automatically,
for the purpose of evaluating the result of an attempted robotic skill
execution. These evaluations in turn serve as "interactive reward functions"
(IRFs) for training reinforcement learning policies to perform the target
skill, such as screwing the table leg tightly. In addition, even after task
policies are fully trained, IRFs can serve as verification mechanisms that
improve online task execution. For any given task, our IRFs can be conveniently
trained using only examples of successful outcomes, and no further
specification is needed to train the task policy thereafter. In our evaluations
on door locking and weighted block stacking in simulation, and screw tightening
on a real robot, IRFs enable large performance improvements, even outperforming
baselines with access to demonstrations or carefully engineered rewards.
Project website: https://sites.google.com/view/lirf-corl-2022/
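The abstract describes reward functions that can be trained from success examples alone. As a loose, hypothetical illustration of the example-based reward idea (NOT the authors' implementation, which additionally learns interactive probing behaviors), the sketch below trains a logistic-regression success classifier on toy outcome features and uses its success probability as a reward signal; all names and numbers here are invented for illustration.

```python
# Hypothetical sketch: an example-based reward classifier.
# Successful outcomes are labeled 1; outcomes from early policy
# rollouts are treated as negatives. The classifier's success
# probability then serves as a reward for policy learning.
import numpy as np

rng = np.random.default_rng(0)

# Toy 1-D outcome feature, e.g. wobble amplitude measured while
# "tugging" a table leg: tight screws wobble little.
success_examples = rng.normal(0.1, 0.05, size=(50, 1))  # label 1
rollout_outcomes = rng.normal(0.8, 0.20, size=(50, 1))  # label 0

X = np.vstack([success_examples, rollout_outcomes])
y = np.concatenate([np.ones(50), np.zeros(50)])

# Fit logistic regression by plain gradient descent.
w, b = np.zeros(1), 0.0
for _ in range(2000):
    p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
    w -= 0.5 * (X.T @ (p - y)) / len(y)
    b -= 0.5 * np.mean(p - y)

def interactive_reward(outcome_feature: float) -> float:
    """Classifier's success probability, used as a reward."""
    return float(1.0 / (1.0 + np.exp(-(outcome_feature * w[0] + b))))

# A low-wobble outcome should earn a higher reward than a wobbly one.
print(interactive_reward(0.1), interactive_reward(0.8))
```

In the paper's setting the classifier input would come from observations gathered while the robot actively probes the object, rather than from a fixed feature as here.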
Related papers
- Real-World Reinforcement Learning of Active Perception Behaviors [27.56548234738969]
A robot's instantaneous sensory observations do not always reveal task-relevant state information. We propose a simple real-world robot learning recipe to efficiently train active perception policies. Our approach, asymmetric advantage weighted regression, exploits access to "privileged" extra sensors at training time.
arXiv Detail & Related papers (2025-12-01T02:05:20Z)
- Solving Robotics Tasks with Prior Demonstration via Exploration-Efficient Deep Reinforcement Learning [0.688204255655161]
This paper proposes an exploration-efficient Deep Reinforcement Learning with Reference policy (DRLR) framework for learning robotics tasks that incorporates demonstrations. The DRLR framework is developed based on an algorithm called Imitation Bootstrapped Reinforcement Learning (IBRL).
arXiv Detail & Related papers (2025-09-04T10:02:32Z)
- FLaRe: Achieving Masterful and Adaptive Robot Policies with Large-Scale Reinforcement Learning Fine-Tuning [74.25049012472502]
FLaRe is a large-scale Reinforcement Learning framework that integrates robust pre-trained representations, large-scale training, and gradient stabilization techniques.
Our method aligns pre-trained policies towards task completion, achieving state-of-the-art (SoTA) performance on previously demonstrated and on entirely novel tasks and embodiments.
arXiv Detail & Related papers (2024-09-25T03:15:17Z)
- Learning Diverse Robot Striking Motions with Diffusion Models and Kinematically Constrained Gradient Guidance [0.3613661942047476]
We develop a novel diffusion modeling approach that is offline, constraint-guided, and expressive of diverse agile behaviors. We demonstrate the effectiveness of our approach for time-critical robotic tasks by evaluating KCGG in two challenging domains: simulated air hockey and real table tennis.
arXiv Detail & Related papers (2024-09-23T20:26:51Z)
- Affordance-Guided Reinforcement Learning via Visual Prompting [51.361977466993345]
Keypoint-based Affordance Guidance for Improvements (KAGI) is a method leveraging rewards shaped by vision-language models (VLMs) for autonomous RL.
On real-world manipulation tasks specified by natural language descriptions, KAGI improves the sample efficiency of autonomous RL and enables successful task completion in 20K online fine-tuning steps.
arXiv Detail & Related papers (2024-07-14T21:41:29Z)
- Robotic Control via Embodied Chain-of-Thought Reasoning [86.6680905262442]
A key limitation of learned robot control policies is their inability to generalize outside their training data.
Recent works on vision-language-action models (VLAs) have shown that the use of large, internet pre-trained vision-language models can substantially improve their robustness and generalization ability.
We introduce Embodied Chain-of-Thought Reasoning (ECoT) for VLAs, in which we train VLAs to perform multiple steps of reasoning about plans, sub-tasks, motions, and visually grounded features before predicting the robot action.
arXiv Detail & Related papers (2024-07-11T17:31:01Z)
- Interactive Robot Learning from Verbal Correction [42.37176329867376]
OLAF allows users to teach a robot using verbal corrections when the robot makes mistakes.
A key feature of OLAF is its ability to update the robot's visuomotor neural policy based on the verbal feedback.
We demonstrate the efficacy of our design in experiments where a user teaches a robot to perform long-horizon manipulation tasks.
arXiv Detail & Related papers (2023-10-26T16:46:12Z)
- Robot Fine-Tuning Made Easy: Pre-Training Rewards and Policies for Autonomous Real-World Reinforcement Learning [58.3994826169858]
We introduce RoboFuME, a reset-free fine-tuning system for robotic reinforcement learning.
Our insights are to utilize offline reinforcement learning techniques to ensure efficient online fine-tuning of a pre-trained policy.
Our method can incorporate data from an existing robot dataset and improve on a target task within as little as 3 hours of autonomous real-world experience.
arXiv Detail & Related papers (2023-10-23T17:50:08Z)
- Self-Improving Robots: End-to-End Autonomous Visuomotor Reinforcement Learning [54.636562516974884]
In imitation and reinforcement learning, the cost of human supervision limits the amount of data that robots can be trained on.
In this work, we propose MEDAL++, a novel design for self-improving robotic systems.
The robot autonomously practices the task by learning to both do and undo the task, simultaneously inferring the reward function from the demonstrations.
arXiv Detail & Related papers (2023-03-02T18:51:38Z)
- Variational Meta Reinforcement Learning for Social Robotics [15.754961709819938]
Social robotics still faces many challenges. One bottleneck is that robotic behaviors often need to be adapted, since social norms depend strongly on the environment.
In this work, we investigate meta-reinforcement learning (meta-RL) as a potential solution.
arXiv Detail & Related papers (2022-06-07T12:08:59Z)
- A Framework for Efficient Robotic Manipulation [79.10407063260473]
We show that, given only 10 demonstrations, a single robotic arm can learn sparse-reward manipulation policies from pixels.
arXiv Detail & Related papers (2020-12-14T22:18:39Z)
- Reinforcement Learning Experiments and Benchmark for Solving Robotic Reaching Tasks [0.0]
Reinforcement learning has been successfully applied to solving the reaching task with robotic arms.
It is shown that augmenting the reward signal with the Hindsight Experience Replay exploration technique increases the average return of off-policy agents.
arXiv Detail & Related papers (2020-11-11T14:00:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.