Related papers: Scaling Laws for Imitation Learning in Single-Agent Games

Scaling Laws for Imitation Learning in Single-Agent Games

URL: http://arxiv.org/abs/2307.09423v2
Date: Sun, 10 Mar 2024 14:50:52 GMT
Title: Scaling Laws for Imitation Learning in Single-Agent Games
Authors: Jens Tuyls, Dhruv Madeka, Kari Torkkola, Dean Foster, Karthik Narasimhan, Sham Kakade
Abstract summary: We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games. We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack. We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
Score: 29.941613597833133
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Imitation Learning (IL) is one of the most widely used methods in machine learning. Yet, many works find it is often unable to fully recover the underlying expert behavior, even in constrained environments like single-agent games. However, none of these works deeply investigate the role of scaling up the model and data size. Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly more capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games. We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack. In all games, we find that IL loss and mean return scale smoothly with the compute budget (FLOPs) and are strongly correlated, resulting in power laws for training compute-optimal IL agents. Finally, we forecast and train several NetHack agents with IL and find they outperform prior state-of-the-art by 1.5x in all settings. Our work both demonstrates the scaling behavior of imitation learning in a variety of single-agent games, as well as the viability of scaling up current approaches for increasingly capable agents in NetHack, a game that remains elusively hard for current AI systems.

Related papers

Game-TARS: Pretrained Foundation Models for Scalable Generalist Multimodal Game Agents [56.25101378553328]
We present Game-TARS, a generalist game agent trained with a unified, scalable action space anchored to human-aligned keyboard-mouse inputs.<n>Game-TARS is pre-trained on over 500B tokens with diverse trajectories and multimodal data.<n> Experiments show that Game-TARS achieves about 2 times the success rate over the previous sota model on open-world Minecraft tasks.
arXiv Detail & Related papers (2025-10-27T17:43:51Z)
Look-ahead Reasoning with a Learned Model in Imperfect Information Games [3.4935179780034242]
This paper introduces an algorithm that learns an abstracted model of an imperfect information game directly from the agent-environment interaction.<n>During test time, this trained model is used to perform look-ahead reasoning.<n>We empirically demonstrate that with sufficient capacity, LAMIR learns the exact underlying game structure, and with limited capacity, it still learns a valuable abstraction.
arXiv Detail & Related papers (2025-10-06T17:26:56Z)
Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents [2.624282086797512]
We introduce Autoverse, an evolvable, domain-specific language for single-player 2D grid-based games. We demonstrate its use as a scalable training ground for Open-Ended Learning (OEL) algorithms.
arXiv Detail & Related papers (2024-07-05T02:18:02Z)
Impact of Decentralized Learning on Player Utilities in Stackelberg Games [57.08270857260131]
In many two-agent systems, each agent learns separately and the rewards of the two agents are not perfectly aligned. We model these systems as Stackelberg games with decentralized learning and show that standard regret benchmarks result in worst-case linear regret for at least one player. We develop algorithms to achieve near-optimal $O(T2/3)$ regret for both players with respect to these benchmarks.
arXiv Detail & Related papers (2024-02-29T23:38:28Z)
Scaling Laws for a Multi-Agent Reinforcement Learning Model [0.0]
We study performance scaling for a cornerstone reinforcement learning algorithm, AlphaZero. We find that player strength scales as a power law in neural network parameter count when not bottlenecked by available compute. We find that the predicted scaling of optimal neural network size fits our data for both games.
arXiv Detail & Related papers (2022-09-29T19:08:51Z)
Multi-Game Decision Transformers [49.257185338595434]
We show that a single transformer-based model can play a suite of up to 46 Atari games simultaneously at close-to-human performance. We compare several approaches in this multi-game setting, such as online and offline RL methods and behavioral cloning. We find that our Multi-Game Decision Transformer models offer the best scalability and performance.
arXiv Detail & Related papers (2022-05-30T16:55:38Z)
TiKick: Toward Playing Multi-agent Football Full Games from Single-agent Demonstrations [31.596018856092513]
Tikick is the first learning-based AI system that can take over the multi-agent Google Research Football full game. To the best of our knowledge, Tikick is the first learning-based AI system that can take over the multi-agent Google Research Football full game.
arXiv Detail & Related papers (2021-10-09T08:34:58Z)
An Empirical Study on the Generalization Power of Neural Representations Learned via Visual Guessing Games [79.23847247132345]
This work investigates how well an artificial agent can benefit from playing guessing games when later asked to perform on novel NLP downstream tasks such as Visual Question Answering (VQA) We propose two ways to exploit playing guessing games: 1) a supervised learning scenario in which the agent learns to mimic successful guessing games and 2) a novel way for an agent to play by itself, called Self-play via Iterated Experience Learning (SPIEL)
arXiv Detail & Related papers (2021-01-31T10:30:48Z)
Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games like Roguelikes, for example, present unique challenges to Deep Reinforcement Learning (DRL) We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z)
The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research. We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL. We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)
Algorithms in Multi-Agent Systems: A Holistic Perspective from Reinforcement Learning and Game Theory [2.5147566619221515]
Deep reinforcement learning has achieved outstanding results in recent years. Recent works are exploring learning beyond single-agent scenarios and considering multi-agent scenarios. Traditional game-theoretic algorithms, which, in turn, show bright application promise combined with modern algorithms and boosting computing power.
arXiv Detail & Related papers (2020-01-17T15:08:04Z)
Model-Based Reinforcement Learning for Atari [89.3039240303797]
We show how video prediction models can enable agents to solve Atari games with fewer interactions than model-free methods. Our experiments evaluate SimPLe on a range of Atari games in low data regime of 100k interactions between the agent and the environment.
arXiv Detail & Related papers (2019-03-01T15:40:19Z)

This list is automatically generated from the titles and abstracts of the papers in this site.