BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
- URL: http://arxiv.org/abs/2402.14194v2
- Date: Thu, 11 Jul 2024 16:50:08 GMT
- Title: BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
- Authors: Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan,
- Abstract summary: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions.
We propose BeTAIL: Behavior Transformer Adversarial Imitation Learning.
We test BeTAIL on three challenges with expert-level demonstrations of real human gameplay in Gran Turismo Sport.
- Score: 48.75878234995544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions. In many robotic tasks, such as autonomous racing, imitated policies must model complex environment dynamics and human decision-making. Sequence modeling is highly effective in capturing intricate patterns of motion sequences but struggles to adapt to new environments or distribution shifts that are common in real-world robotics tasks. In contrast, Adversarial Imitation Learning (AIL) can mitigate this effect, but struggles with sample inefficiency and handling complex motion patterns. Thus, we propose BeTAIL: Behavior Transformer Adversarial Imitation Learning, which combines a Behavior Transformer (BeT) policy from human demonstrations with online AIL. BeTAIL adds an AIL residual policy to the BeT policy to model the sequential decision-making process of human experts and correct for out-of-distribution states or shifts in environment dynamics. We test BeTAIL on three challenges with expert-level demonstrations of real human gameplay in Gran Turismo Sport. Our proposed residual BeTAIL reduces environment interactions and improves racing performance and stability, even when the BeT is pretrained on different tracks than downstream learning. Videos and code available at: https://sites.google.com/berkeley.edu/BeTAIL/home.
Related papers
- Large-Scale Actionless Video Pre-Training via Discrete Diffusion for
Efficient Policy Learning [73.69573252516761]
We introduce a novel framework that combines generative pre-training on human videos and policy fine-tuning on action-labeled robot videos.
Our method generates high-fidelity future videos for planning and enhances the fine-tuned policies compared to previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-02-22T09:48:47Z) - SWBT: Similarity Weighted Behavior Transformer with the Imperfect
Demonstration for Robotic Manipulation [32.78083518963342]
We propose a novel framework named Similarity Weighted Behavior Transformer (SWBT)
SWBT effectively learn from both expert and imperfect demonstrations without interaction with environments.
We are the first to attempt to integrate imperfect demonstrations into the offline imitation learning setting for robot manipulation tasks.
arXiv Detail & Related papers (2024-01-17T04:15:56Z) - Infer and Adapt: Bipedal Locomotion Reward Learning from Demonstrations
via Inverse Reinforcement Learning [5.246548532908499]
This paper brings state-of-the-art Inverse Reinforcement Learning (IRL) techniques to solving bipedal locomotion problems over complex terrains.
We propose algorithms for learning expert reward functions, and we subsequently analyze the learned functions.
We empirically demonstrate that training a bipedal locomotion policy with the inferred reward functions enhances its walking performance on unseen terrains.
arXiv Detail & Related papers (2023-09-28T00:11:06Z) - Nonprehensile Planar Manipulation through Reinforcement Learning with
Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing such robot controllers.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies.
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z) - Imitating Task and Motion Planning with Visuomotor Transformers [71.41938181838124]
Task and Motion Planning (TAMP) can autonomously generate large-scale datasets of diverse demonstrations.
In this work, we show that the combination of large-scale datasets generated by TAMP supervisors and flexible Transformer models to fit them is a powerful paradigm for robot manipulation.
We present a novel imitation learning system called OPTIMUS that trains large-scale visuomotor Transformer policies by imitating a TAMP agent.
arXiv Detail & Related papers (2023-05-25T17:58:14Z) - OSCAR: Data-Driven Operational Space Control for Adaptive and Robust
Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation.
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
arXiv Detail & Related papers (2021-10-02T01:21:38Z) - Learning Bipedal Robot Locomotion from Human Movement [0.791553652441325]
We present a reinforcement learning based method for teaching a real world bipedal robot to perform movements directly from motion capture data.
Our method seamlessly transitions from training in a simulation environment to executing on a physical robot.
We demonstrate our method on an internally developed humanoid robot with movements ranging from a dynamic walk cycle to complex balancing and waving.
arXiv Detail & Related papers (2021-05-26T00:49:37Z) - Reinforcement Learning for Robust Parameterized Locomotion Control of
Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation.
domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics.
We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
arXiv Detail & Related papers (2021-03-26T07:14:01Z) - Modeling Human Driving Behavior through Generative Adversarial Imitation
Learning [7.387855463533219]
This article describes the use of Generative Adversarial Imitation Learning (GAIL) for learning-based driver modeling.
Because driver modeling is inherently a multi-agent problem, this paper describes a parameter-sharing extension of GAIL called PS-GAIL to tackle multi-agent driver modeling.
This paper describes Reward Augmented Imitation Learning (RAIL), which modifies the reward signal to provide domain-specific knowledge to the agent.
arXiv Detail & Related papers (2020-06-10T05:47:39Z) - Learning Agile Robotic Locomotion Skills by Imitating Animals [72.36395376558984]
Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics.
We present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals.
arXiv Detail & Related papers (2020-04-02T02:56:16Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.