Learn to Teach: Improve Sample Efficiency in Teacher-student Learning for Sim-to-Real Transfer
- URL: http://arxiv.org/abs/2402.06783v1
- Date: Fri, 9 Feb 2024 21:16:43 GMT
- Title: Learn to Teach: Improve Sample Efficiency in Teacher-student Learning for Sim-to-Real Transfer
- Authors: Feiyang Wu, Zhaoyuan Gu, Ye Zhao, Anqi Wu
- Abstract summary: We propose a sample-efficient learning framework termed Learn to Teach (L2T) that recycles experience collected by the teacher agent.
We show that a single-loop algorithm can train both the teacher and student agents under both Reinforcement Learning and Inverse Reinforcement Learning contexts.
- Score: 5.731477362725785
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Simulation-to-reality (sim-to-real) transfer is a fundamental problem for
robot learning. Domain Randomization, which adds randomization during training,
is a powerful technique that effectively addresses the sim-to-real gap.
However, the noise in observations makes learning significantly harder.
Recently, studies have shown that employing a teacher-student learning paradigm
can accelerate training in randomized environments. Learned with privileged
information, a teacher agent can instruct the student agent to operate in noisy
environments. However, this approach is often not sample-efficient, as the
experience collected by the teacher is discarded completely when training the
student, wasting information revealed by the environment. In this work, we
extend the teacher-student learning paradigm by proposing a sample-efficient
learning framework termed Learn to Teach (L2T) that recycles experience
collected by the teacher agent. We observe that the dynamics of the
environments for both agents remain unchanged, and the state space of the
teacher is coupled with the observation space of the student. We show that a
single-loop algorithm can train both the teacher and student agents under both
Reinforcement Learning and Inverse Reinforcement Learning contexts. We
implement variants of our methods, conduct experiments on the MuJoCo benchmark,
and apply our methods to the Cassie robot locomotion problem. Extensive
experiments show that our method achieves competitive performance while
requiring environment interaction only through the teacher.
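To make the recycling idea concrete, here is a minimal sketch of the single-loop scheme the abstract describes: only the teacher interacts with the environment, and every transition it collects is reused to supervise the student through the assumed state-observation coupling. The noise model, the `obs_from_state` coupling, and the agent interfaces are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def obs_from_state(state, noise_std=0.05):
    """Assumed coupling: the student sees a noisy view of the privileged state."""
    return state + np.random.normal(0.0, noise_std, size=state.shape)

def learn_to_teach_loop(env, teacher, student, num_steps):
    """One loop trains both agents; only the teacher touches the environment."""
    state = env.reset()
    for _ in range(num_steps):
        action = teacher.act(state)                  # teacher uses privileged state
        next_state, reward, done = env.step(action)
        teacher.update(state, action, reward, next_state, done)  # RL update
        # Recycle the same transition for the student: it learns to reproduce
        # the teacher's action from its own noisy observation, so no extra
        # environment interaction is spent on the student.
        student.update(obs_from_state(state), action)
        state = env.reset() if done else next_state
```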
Related papers
- Adaptive Teaching in Heterogeneous Agents: Balancing Surprise in Sparse Reward Scenarios [3.638198517970729]
Learning from Demonstration can be an efficient way to train systems with analogous agents.
However, naively replicating demonstrations that are beyond the Student's capability can limit efficient learning.
We present a Teacher-Student learning framework specifically tailored to address the challenge of heterogeneity between the Teacher and Student agents.
arXiv Detail & Related papers (2024-05-23T05:52:42Z)
- YODA: Teacher-Student Progressive Learning for Language Models [82.0172215948963]
This paper introduces YODA, a teacher-student progressive learning framework.
It emulates the teacher-student education process to improve the efficacy of model fine-tuning.
Experiments show that training LLaMA2 with data from YODA improves on SFT with a significant performance gain.
arXiv Detail & Related papers (2024-01-28T14:32:15Z)
- Distantly-Supervised Named Entity Recognition with Adaptive Teacher Learning and Fine-grained Student Ensemble [56.705249154629264]
Self-training teacher-student frameworks are proposed to improve the robustness of NER models.
In this paper, we propose an adaptive teacher learning method comprising two teacher-student networks.
Fine-grained student ensemble updates each fragment of the teacher model with a temporal moving average of the corresponding fragment of the student, which enhances consistent predictions on each model fragment against noise.
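As a rough illustration of the fragment-wise moving-average update just described (the fragment layout and decay value are assumptions, not the paper's settings):

```python
def ema_update_fragments(teacher_fragments, student_fragments, decay=0.99):
    """Move each teacher fragment toward its student counterpart via an EMA.

    Both arguments are assumed to be dicts mapping a fragment name to a list
    of float numpy arrays holding that fragment's parameters.
    """
    for name, teacher_params in teacher_fragments.items():
        for t_p, s_p in zip(teacher_params, student_fragments[name]):
            # teacher <- decay * teacher + (1 - decay) * student, in place
            t_p *= decay
            t_p += (1.0 - decay) * s_p
```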
arXiv Detail & Related papers (2022-12-13T12:14:09Z)
- Teacher-student curriculum learning for reinforcement learning [1.7259824817932292]
Reinforcement learning (RL) is a popular paradigm for sequential decision-making problems.
The sample inefficiency of deep reinforcement learning methods is a significant obstacle when applying RL to real-world problems.
We propose a teacher-student curriculum learning setting where we simultaneously train a teacher that selects tasks for the student while the student learns how to solve the selected task.
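A toy rendering of that simultaneous loop might look as follows; the progress-based selection rule and the `student` interface are invented for illustration, not the paper's exact criterion.

```python
import random

def curriculum_loop(tasks, student, steps):
    """Teacher picks the next task while the student trains on it."""
    progress = {t: 0.0 for t in tasks}
    for _ in range(steps):
        # Teacher policy (assumed): favor tasks with recent learning progress,
        # with a little noise so untried tasks still get sampled.
        task = max(tasks, key=lambda t: progress[t] + 0.1 * random.random())
        score_before = student.evaluate(task)
        student.train_on(task)
        progress[task] = student.evaluate(task) - score_before
    return student
```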
arXiv Detail & Related papers (2022-10-31T14:45:39Z)
- Know Thy Student: Interactive Learning with Gaussian Processes [11.641731210416102]
Our work proposes a simple diagnosis algorithm which uses Gaussian processes for inferring student-related information, before constructing a teaching dataset.
We study this in the offline reinforcement learning setting where the teacher must provide demonstrations to the student and avoid sending redundant trajectories.
Our experiments highlight the importance of diagnosing before teaching and demonstrate how students can learn more efficiently with the help of an interactive teacher.
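A minimal sketch of that diagnose-then-teach step using scikit-learn's GP regressor; the task features, scores, and acquisition rule below are placeholders, not the paper's setup.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF

# Fit a GP to the student's performance on a few probe tasks (placeholder data).
probe_tasks = np.random.rand(8, 2)    # task features shown to the student so far
probe_scores = np.random.rand(8)      # student's measured performance on them

gp = GaussianProcessRegressor(kernel=RBF(length_scale=0.5))
gp.fit(probe_tasks, probe_scores)

# Diagnose before teaching: estimate competence on candidate tasks and teach
# where the student looks weakest, skipping redundant demonstrations.
candidates = np.random.rand(100, 2)
mean, std = gp.predict(candidates, return_std=True)
next_demo_task = candidates[np.argmin(mean - std)]
```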
arXiv Detail & Related papers (2022-04-26T04:43:57Z)
- TRAIL: Near-Optimal Imitation Learning with Suboptimal Data [100.83688818427915]
We present training objectives that use offline datasets to learn a factored transition model.
Our theoretical analysis shows that the learned latent action space can boost the sample efficiency of downstream imitation learning.
To learn the latent action space in practice, we propose TRAIL (Transition-Reparametrized Actions for Imitation Learning), an algorithm that learns an energy-based transition model.
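The latent-action idea can be sketched roughly as below, using a simple reconstruction loss rather than TRAIL's energy-based objective; dimensions and data are placeholders.

```python
import torch
import torch.nn as nn

# Encode each observed transition (s, s') into a latent action z, and require
# z plus s to predict s'; downstream imitation can then act in z-space.
state_dim, latent_dim = 16, 4
encoder = nn.Sequential(nn.Linear(2 * state_dim, 64), nn.ReLU(), nn.Linear(64, latent_dim))
dynamics = nn.Sequential(nn.Linear(state_dim + latent_dim, 64), nn.ReLU(), nn.Linear(64, state_dim))
opt = torch.optim.Adam([*encoder.parameters(), *dynamics.parameters()], lr=1e-3)

s = torch.randn(128, state_dim)       # offline transitions (placeholder data)
s_next = torch.randn(128, state_dim)

z = encoder(torch.cat([s, s_next], dim=1))                        # infer latent action
loss = ((dynamics(torch.cat([s, z], dim=1)) - s_next) ** 2).mean()  # predict s'
opt.zero_grad()
loss.backward()
opt.step()
```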
arXiv Detail & Related papers (2021-10-27T21:05:00Z)
- Iterative Teacher-Aware Learning [136.05341445369265]
In human pedagogy, teachers and students can interact adaptively to maximize communication efficiency.
We propose a gradient-optimization-based teacher-aware learner that incorporates the teacher's cooperative intention into the likelihood function.
arXiv Detail & Related papers (2021-10-01T00:27:47Z)
- Interaction-limited Inverse Reinforcement Learning [50.201765937436654]
We present two different training strategies: Curriculum Inverse Reinforcement Learning (CIRL) covering the teacher's perspective, and Self-Paced Inverse Reinforcement Learning (SPIRL) focusing on the learner's perspective.
Using experiments in simulation and with a real robot learning a task from a human demonstrator, we show that our training strategies enable faster training than a random teacher for CIRL and than a batch learner for SPIRL.
arXiv Detail & Related papers (2020-07-01T12:31:52Z)
- Never Stop Learning: The Effectiveness of Fine-Tuning in Robotic Reinforcement Learning [109.77163932886413]
We show how to adapt vision-based robotic manipulation policies to new variations by fine-tuning via off-policy reinforcement learning.
This adaptation uses less than 0.2% of the data necessary to learn the task from scratch.
We find that our approach of adapting pre-trained policies leads to substantial performance gains over the course of fine-tuning.
arXiv Detail & Related papers (2020-04-21T17:57:04Z)
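The adaptation recipe summarized above follows a familiar pattern: restore pre-trained weights, then keep optimizing an off-policy objective on data from the new variation. A hedged PyTorch sketch, with the network shape, checkpoint path, and simplified TD-style loss all assumed:

```python
import torch
import torch.nn as nn

# Hypothetical pre-trained Q-network; shape and checkpoint path are assumptions.
policy = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 8))
policy.load_state_dict(torch.load("pretrained_policy.pt"))   # reuse prior training
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-4)   # small LR: adapt, don't relearn

def td_loss(net, obs, actions, returns):
    """Simplified off-policy objective: regress chosen-action values to returns."""
    q = net(obs).gather(1, actions.unsqueeze(1)).squeeze(1)
    return ((q - returns) ** 2).mean()

# Stand-in for a replay buffer filled in the new task variation; per the
# summary above, far less data is needed than learning from scratch.
new_variation_batches = [(torch.randn(32, 64), torch.randint(0, 8, (32,)), torch.randn(32))]

for obs, actions, returns in new_variation_batches:
    loss = td_loss(policy, obs, actions, returns)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```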