BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
- URL: http://arxiv.org/abs/2402.14194v2
- Date: Thu, 11 Jul 2024 16:50:08 GMT
- Title: BeTAIL: Behavior Transformer Adversarial Imitation Learning from Human Racing Gameplay
- Authors: Catherine Weaver, Chen Tang, Ce Hao, Kenta Kawamoto, Masayoshi Tomizuka, Wei Zhan
- Abstract summary: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions.
We propose BeTAIL: Behavior Transformer Adversarial Imitation Learning.
We test BeTAIL on three challenges with expert-level demonstrations of real human gameplay in Gran Turismo Sport.
- Score: 48.75878234995544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Imitation learning learns a policy from demonstrations without requiring hand-designed reward functions. In many robotic tasks, such as autonomous racing, imitated policies must model complex environment dynamics and human decision-making. Sequence modeling is highly effective in capturing intricate patterns of motion sequences but struggles to adapt to new environments or distribution shifts that are common in real-world robotics tasks. In contrast, Adversarial Imitation Learning (AIL) can mitigate this effect, but struggles with sample inefficiency and handling complex motion patterns. Thus, we propose BeTAIL: Behavior Transformer Adversarial Imitation Learning, which combines a Behavior Transformer (BeT) policy from human demonstrations with online AIL. BeTAIL adds an AIL residual policy to the BeT policy to model the sequential decision-making process of human experts and correct for out-of-distribution states or shifts in environment dynamics. We test BeTAIL on three challenges with expert-level demonstrations of real human gameplay in Gran Turismo Sport. Our proposed residual BeTAIL reduces environment interactions and improves racing performance and stability, even when the BeT is pretrained on different tracks than downstream learning. Videos and code available at: https://sites.google.com/berkeley.edu/BeTAIL/home.
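The core mechanism is the residual combination: the BeT proposes a base action from the recent state-action history, and the AIL-trained residual policy adds a bounded correction on top of it. Below is a minimal Python sketch of that combination; the class and method names are hypothetical and the residual scale is an assumed hyperparameter, not the authors' implementation.

```python
import numpy as np

class ResidualBeTAILPolicy:
    """Minimal sketch: a pretrained Behavior Transformer (BeT) proposes a
    base action, and a residual policy trained online with AIL adds a
    bounded correction. Interfaces here are hypothetical."""

    def __init__(self, bet_policy, residual_policy, residual_scale=0.2):
        self.bet = bet_policy            # pretrained on human demonstrations
        self.residual = residual_policy  # trained online against an AIL discriminator
        self.alpha = residual_scale      # assumed cap on how far the correction can move the base action

    def act(self, state, history):
        a_base = self.bet.predict(history)           # sequence-conditioned base action
        a_res = self.residual.sample(state, a_base)  # correction in [-1, 1]
        action = a_base + self.alpha * a_res         # residual correction on top of BeT
        return np.clip(action, -1.0, 1.0)            # keep within action bounds
```

Bounding the residual keeps the online policy close to the human-like BeT prior while still letting AIL correct for out-of-distribution states or shifted dynamics.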
Related papers
- VITAL: Visual Teleoperation to Enhance Robot Learning through Human-in-the-Loop Corrections [10.49712834719005]
We propose a low-cost visual teleoperation system for bimanual manipulation tasks, called VITAL.
Our approach leverages affordable hardware and visual processing techniques to collect demonstrations.
We enhance the generalizability and robustness of the learned policies by utilizing both real and simulated environments.
arXiv Detail & Related papers (2024-07-30T23:29:47Z)
- Learning Variable Compliance Control From a Few Demonstrations for Bimanual Robot with Haptic Feedback Teleoperation System [5.497832119577795]
Performing dexterous, contact-rich manipulation tasks with rigid robots is a significant challenge in robotics.
Compliance control schemes have been introduced to mitigate these issues by controlling forces via external sensors.
Learning from Demonstrations offers an intuitive alternative, allowing robots to learn manipulations through observed actions.
arXiv Detail & Related papers (2024-06-21T09:03:37Z)
- Learning an Actionable Discrete Diffusion Policy via Large-Scale Actionless Video Pre-Training [69.54948297520612]
Learning a generalist embodied agent poses challenges, primarily stemming from the scarcity of action-labeled robotic datasets.
We introduce a novel framework to tackle these challenges, which leverages a unified discrete diffusion to combine generative pre-training on human videos and policy fine-tuning on a small number of action-labeled robot videos.
Our method generates high-fidelity future videos for planning and enhances the fine-tuned policies compared to previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-02-22T09:48:47Z)
- SWBT: Similarity Weighted Behavior Transformer with the Imperfect Demonstration for Robotic Manipulation [32.78083518963342]
We propose a novel framework named Similarity Weighted Behavior Transformer (SWBT)
SWBT effectively learns from both expert and imperfect demonstrations without interacting with the environment (a weighting sketch follows this entry).
We are the first to attempt to integrate imperfect demonstrations into the offline imitation learning setting for robot manipulation tasks.
arXiv Detail & Related papers (2024-01-17T04:15:56Z)
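As a rough illustration of the similarity-weighting idea above, the sketch below down-weights imperfect transitions by a learned similarity score before adding them to a behavior-cloning loss. All names (sim_net, the batch keys) are assumptions for illustration, not the SWBT implementation.

```python
import torch
import torch.nn.functional as F

def similarity_weighted_bc_loss(policy, expert_batch, imperfect_batch, sim_net):
    """Hypothetical sketch of similarity-weighted imitation: imperfect
    transitions contribute to the loss in proportion to a learned
    similarity score in [0, 1] that compares them to expert behavior."""
    # Standard behavior-cloning loss on expert transitions (weight 1).
    expert_loss = F.mse_loss(policy(expert_batch["obs"]), expert_batch["act"])

    # Imperfect transitions are down-weighted by their similarity score.
    with torch.no_grad():
        w = sim_net(imperfect_batch["obs"], imperfect_batch["act"])  # (B, 1) in [0, 1]
    pred = policy(imperfect_batch["obs"])
    per_sample = ((pred - imperfect_batch["act"]) ** 2).mean(dim=-1, keepdim=True)
    imperfect_loss = (w * per_sample).mean()

    return expert_loss + imperfect_loss
```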
- Nonprehensile Planar Manipulation through Reinforcement Learning with Multimodal Categorical Exploration [8.343657309038285]
Reinforcement Learning is a powerful framework for developing controllers for such nonprehensile manipulation tasks.
We propose a multimodal exploration approach through categorical distributions, which enables us to train planar pushing RL policies (a minimal policy-head sketch follows this entry).
We show that the learned policies are robust to external disturbances and observation noise, and scale to tasks with multiple pushers.
arXiv Detail & Related papers (2023-08-04T16:55:00Z)
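To illustrate the categorical-exploration idea from the entry above, here is a hedged PyTorch sketch of a policy head that outputs one categorical distribution per discretized action dimension; the network sizes and bin count are illustrative assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class CategoricalPushPolicy(nn.Module):
    """Illustrative sketch: each continuous action dimension is discretized
    into bins, and the policy samples one bin per dimension. Categorical
    sampling preserves multimodal behavior that a unimodal Gaussian head
    would average away."""

    def __init__(self, obs_dim, act_dim=2, bins=11):
        super().__init__()
        self.act_dim, self.bins = act_dim, bins
        self.net = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(),
            nn.Linear(128, act_dim * bins),  # logits for every (dimension, bin) pair
        )
        # Bin centers map discrete choices back to continuous actions in [-1, 1].
        self.register_buffer("centers", torch.linspace(-1.0, 1.0, bins))

    def forward(self, obs):
        logits = self.net(obs).view(-1, self.act_dim, self.bins)
        dist = torch.distributions.Categorical(logits=logits)
        idx = dist.sample()                     # one bin index per action dimension
        return self.centers[idx], dist.log_prob(idx).sum(-1)
```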
- Imitating Task and Motion Planning with Visuomotor Transformers [71.41938181838124]
Task and Motion Planning (TAMP) can autonomously generate large-scale datasets of diverse demonstrations.
In this work, we show that the combination of large-scale datasets generated by TAMP supervisors and flexible Transformer models to fit them is a powerful paradigm for robot manipulation.
We present a novel imitation learning system called OPTIMUS that trains large-scale visuomotor Transformer policies by imitating a TAMP agent.
arXiv Detail & Related papers (2023-05-25T17:58:14Z)
- OSCAR: Data-Driven Operational Space Control for Adaptive and Robust Robot Manipulation [50.59541802645156]
Operational Space Control (OSC) has been used as an effective task-space controller for manipulation (the classical control law is sketched after this entry).
We propose OSC for Adaptation and Robustness (OSCAR), a data-driven variant of OSC that compensates for modeling errors.
We evaluate our method on a variety of simulated manipulation problems, and find substantial improvements over an array of controller baselines.
arXiv Detail & Related papers (2021-10-02T01:21:38Z)
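For context on the entry above, the sketch below shows the classical operational space control law that OSCAR makes data-driven. The gains and the omission of gravity and Coriolis compensation are simplifications; OSCAR's contribution is replacing the analytic dynamics terms with learned ones, which is not shown here.

```python
import numpy as np

def osc_torques(x_err, dx, J, M, kp=150.0, kd=2.0 * np.sqrt(150.0)):
    """Textbook operational space control (simplified): a task-space PD
    error is shaped by the task-space inertia and mapped to joint torques
    through the Jacobian transpose."""
    M_inv = np.linalg.inv(M)              # inverse of the joint-space mass matrix
    lambda_inv = J @ M_inv @ J.T          # task-space inverse inertia
    lam = np.linalg.pinv(lambda_inv)      # task-space inertia (pseudo-inverse for robustness)
    f_task = lam @ (kp * x_err - kd * dx) # desired task-space wrench from PD terms
    return J.T @ f_task                   # map the wrench to joint torques
```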
- Reinforcement Learning for Robust Parameterized Locomotion Control of Bipedal Robots [121.42930679076574]
We present a model-free reinforcement learning framework for training robust locomotion policies in simulation.
Domain randomization is used to encourage the policies to learn behaviors that are robust across variations in system dynamics (a minimal sketch follows this entry).
We demonstrate this on versatile walking behaviors such as tracking a target walking velocity, walking height, and turning yaw.
arXiv Detail & Related papers (2021-03-26T07:14:01Z)
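The sketch below illustrates the domain-randomization recipe from the entry above: resampling dynamics parameters at every episode reset so the policy cannot overfit a single dynamics model. The specific parameters, ranges, and MuJoCo-style attributes are illustrative assumptions, not the paper's configuration.

```python
import numpy as np

def randomize_dynamics(env, rng: np.random.Generator):
    """Hedged sketch: perturb physical parameters within plausible ranges
    at each reset. Attribute names follow MuJoCo conventions but are
    illustrative of the technique, not the paper's exact setup."""
    # Scale link masses by up to +/- 20%.
    env.model.body_mass *= rng.uniform(0.8, 1.2, size=env.model.body_mass.shape)
    # Resample ground/sliding friction coefficients.
    env.model.geom_friction[:, 0] = rng.uniform(0.4, 1.2, size=env.model.geom_friction.shape[0])
    # Simulate actuation latency, in control steps (hypothetical attribute).
    env.action_delay = rng.integers(0, 3)
```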
- Modeling Human Driving Behavior through Generative Adversarial Imitation Learning [7.387855463533219]
This article describes the use of Generative Adversarial Imitation Learning (GAIL) for learning-based driver modeling.
Because driver modeling is inherently a multi-agent problem, this paper describes a parameter-sharing extension of GAIL called PS-GAIL to tackle multi-agent driver modeling.
This paper also describes Reward Augmented Imitation Learning (RAIL), which modifies the reward signal to provide domain-specific knowledge to the agent (see the sketch after this entry).
arXiv Detail & Related papers (2020-06-10T05:47:39Z)
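As a hedged sketch of RAIL-style reward augmentation, the snippet below combines the standard GAIL reward with penalty terms for undesirable driving events; the event indicators and weights are illustrative assumptions, not the paper's values.

```python
import torch

def rail_reward(discriminator, state, action, off_road, hard_brake,
                w_offroad=2.0, w_brake=0.5):
    """Sketch in the spirit of RAIL: the usual GAIL reward
    -log(1 - D(s, a)) is combined with penalties encoding domain
    knowledge about undesirable driving events. Weights and indicators
    are illustrative."""
    with torch.no_grad():
        d = discriminator(state, action)    # probability the pair looks expert-like
    imitation = -torch.log(1.0 - d + 1e-8)  # standard GAIL reward signal
    penalty = w_offroad * off_road + w_brake * hard_brake  # binary event indicators
    return imitation - penalty
```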
- Learning Agile Robotic Locomotion Skills by Imitating Animals [72.36395376558984]
Reproducing the diverse and agile locomotion skills of animals has been a longstanding challenge in robotics.
We present an imitation learning system that enables legged robots to learn agile locomotion skills by imitating real-world animals.
arXiv Detail & Related papers (2020-04-02T02:56:16Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.