Benchmarking End-to-End Behavioural Cloning on Video Games
- URL: http://arxiv.org/abs/2004.00981v2
- Date: Mon, 18 May 2020 13:50:11 GMT
- Authors: Anssi Kanervisto, Joonas Pussinen, Ville Hautamäki
- Abstract summary: We study the general applicability of behavioural cloning on twelve video games, including six modern video games (published after 2010).
Our results show that these agents cannot match humans in raw performance but do learn basic dynamics and rules.
We also demonstrate how the quality of the data matters, and how recording data from humans is subject to a state-action mismatch, due to human reflexes.
- Score: 5.863352129133669
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Behavioural cloning, where a computer is taught to perform a task based on
demonstrations, has been successfully applied to various video games and
robotics tasks, with and without reinforcement learning. This also includes
end-to-end approaches, where a computer plays a video game like humans do: by
looking at the image displayed on the screen, and sending keystrokes to the
game. As a general approach to playing video games, this has many inviting
properties: no need for specialized modifications to the game, no lengthy
training sessions and the ability to re-use the same tools across different
games. However, related work includes game-specific engineering to achieve the
results. We take a step towards a general approach and study the general
applicability of behavioural cloning on twelve video games, including six
modern video games (published after 2010), by using human demonstrations as
training data. Our results show that these agents cannot match humans in raw
performance but do learn basic dynamics and rules. We also demonstrate how the
quality of the data matters, and how recording data from humans is subject to a
state-action mismatch, due to human reflexes.
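The two central ideas of the abstract, behavioural cloning as supervised learning on (screen, keypress) pairs, and re-aligning human actions to compensate for reaction delay, can be sketched in a few lines. This is a minimal illustration with a toy "game" and a linear softmax policy; the `reaction_delay` knob, the toy task, and all function names are assumptions of this sketch, not the paper's implementation (which trains deep networks on real gameplay recordings).

```python
import numpy as np

rng = np.random.default_rng(0)

def make_bc_dataset(frames, actions, reaction_delay=0):
    """Re-align recorded human actions with the frames that caused them.

    A key press logged at step t is typically a reaction to a frame shown
    `reaction_delay` steps earlier, so shifting actions back by that delay
    reduces the state-action mismatch described in the abstract.  The delay
    value is a hypothetical knob for this sketch.
    """
    if reaction_delay > 0:
        frames = frames[:-reaction_delay]
        actions = actions[reaction_delay:]
    return frames, actions

def train_bc_policy(frames, actions, n_actions, lr=0.5, epochs=300):
    """Behavioural cloning as supervised learning: a linear softmax policy
    fit with cross-entropy on (screen, keypress) pairs."""
    X = frames.reshape(len(frames), -1)       # flatten pixels into features
    Y = np.eye(n_actions)[actions]            # one-hot action targets
    W = np.zeros((X.shape[1], n_actions))
    for _ in range(epochs):
        logits = X @ W
        logits -= logits.max(axis=1, keepdims=True)   # numerical stability
        probs = np.exp(logits)
        probs /= probs.sum(axis=1, keepdims=True)
        W -= lr * X.T @ (probs - Y) / len(X)          # cross-entropy gradient
    return W

# Toy "game": press key 0 when the left half of the screen is brighter,
# key 1 otherwise.  An end-to-end agent would use a convolutional network.
frames = rng.random((500, 4, 4))
actions = (frames[:, :, :2].mean(axis=(1, 2)) <
           frames[:, :, 2:].mean(axis=(1, 2))).astype(int)
X, y = make_bc_dataset(frames, actions)
W = train_bc_policy(X, y, n_actions=2)
accuracy = ((X.reshape(len(X), -1) @ W).argmax(axis=1) == y).mean()
```

The same `make_bc_dataset` call with `reaction_delay=3` would pair each frame with the action the human emitted three steps later, which is the kind of correction the state-action mismatch finding motivates.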
Related papers
- HumanPlus: Humanoid Shadowing and Imitation from Humans [82.47551890765202]
We introduce a full-stack system for humanoids to learn motion and autonomous skills from human data.
We first train a low-level policy in simulation via reinforcement learning using existing 40-hour human motion datasets.
We then perform supervised behavior cloning to train skill policies using egocentric vision, allowing humanoids to complete different tasks autonomously.
arXiv Detail & Related papers (2024-06-15T00:41:34Z)
- Large-Scale Actionless Video Pre-Training via Discrete Diffusion for Efficient Policy Learning [73.69573252516761]
We introduce a novel framework that combines generative pre-training on human videos and policy fine-tuning on action-labeled robot videos.
Our method generates high-fidelity future videos for planning and enhances the fine-tuned policies compared to previous state-of-the-art approaches.
arXiv Detail & Related papers (2024-02-22T09:48:47Z)
- Behavioural Cloning in VizDoom [1.4999444543328293]
This paper describes methods for training autonomous agents to play the game "Doom 2" through Imitation Learning (IL).
We also explore how Reinforcement Learning (RL) compares to IL for humanness by comparing camera movement and trajectory data.
arXiv Detail & Related papers (2024-01-08T16:15:43Z)
- Learning Reward Functions for Robotic Manipulation by Observing Humans [92.30657414416527]
We use unlabeled videos of humans solving a wide range of manipulation tasks to learn a task-agnostic reward function for robotic manipulation policies.
The learned rewards are based on distances to a goal in an embedding space learned using a time-contrastive objective.
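The reward construction described above, negative distance to a goal in a learned embedding space, reduces to a one-liner once an encoder exists. In this sketch `phi` is a hypothetical placeholder for the encoder that the paper trains with a time-contrastive objective; the function name and identity encoder are assumptions, not the paper's API.

```python
import numpy as np

def embedding_reward(phi, state, goal):
    """Reward = negative distance between state and goal in embedding space.
    `phi` stands in for a learned encoder (here a trivial placeholder)."""
    return -float(np.linalg.norm(phi(state) - phi(goal)))

phi = lambda x: np.asarray(x, dtype=float)   # identity stand-in encoder
goal = [1.0, 0.0]
r_near = embedding_reward(phi, [0.9, 0.1], goal)  # state close to the goal
r_far = embedding_reward(phi, [0.0, 1.0], goal)   # state far from the goal
```

States nearer the goal receive higher (less negative) reward, which is what lets the learned function drive a task-agnostic manipulation policy.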
arXiv Detail & Related papers (2022-11-16T16:26:48Z)
- Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos [16.858980871368175]
We extend the internet-scale pretraining paradigm to sequential decision domains through semi-supervised imitation learning.
We show that this behavioural prior has nontrivial zero-shot capabilities and that it can be fine-tuned with both imitation learning and reinforcement learning.
For many tasks our models exhibit human-level performance, and we are the first to report computer agents that can craft diamond tools.
arXiv Detail & Related papers (2022-06-23T16:01:11Z)
- Playing for 3D Human Recovery [74.01259933358331]
In this work, we obtain massive human sequences as well as their 3D ground truths by playing video games.
Specifically, we contribute, GTA-Human, a mega-scale and highly-diverse 3D human dataset generated with the GTA-V game engine.
With a rich set of subjects, actions, and scenarios, GTA-Human serves as an effective training source.
arXiv Detail & Related papers (2021-10-14T17:49:42Z)
- Playful Interactions for Representation Learning [82.59215739257104]
We propose to use playful interactions in a self-supervised manner to learn visual representations for downstream tasks.
We collect 2 hours of playful data in 19 diverse environments and use self-predictive learning to extract visual representations.
Our representations generalize better than standard behavior cloning and can achieve similar performance with only half the number of required demonstrations.
arXiv Detail & Related papers (2021-07-19T17:54:48Z)
- Visual Imitation Made Easy [102.36509665008732]
We present an alternate interface for imitation that simplifies the data collection process while allowing for easy transfer to robots.
We use commercially available reacher-grabber assistive tools both as a data collection device and as the robot's end-effector.
We experimentally evaluate on two challenging tasks: non-prehensile pushing and prehensile stacking, with 1000 diverse demonstrations for each task.
arXiv Detail & Related papers (2020-08-11T17:58:50Z)
- Learning to Play by Imitating Humans [8.209859328381269]
We show that it is possible to acquire a diverse set of skills by self-supervising control on top of human teleoperated play data.
By training a behavioral cloning policy on a relatively small quantity of human play, we autonomously generate a large quantity of cloned play data.
We demonstrate that a general purpose goal-conditioned policy trained on this augmented dataset substantially outperforms one trained only with the original human data.
arXiv Detail & Related papers (2020-06-11T23:28:54Z)
- Navigating the Landscape of Multiplayer Games [20.483315340460127]
We show how network measures applied to response graphs of large-scale games enable the creation of a landscape of games.
We illustrate our findings in domains ranging from canonical games to complex empirical games capturing the performance of trained agents pitted against one another.
arXiv Detail & Related papers (2020-05-04T16:58:17Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.