NetHack is Hard to Hack
- URL: http://arxiv.org/abs/2305.19240v2
- Date: Mon, 30 Oct 2023 14:23:33 GMT
- Title: NetHack is Hard to Hack
- Authors: Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
- Abstract summary: In the NeurIPS 2021 NetHack Challenge, symbolic agents outperformed neural approaches by over four times in median game score.
We present an extensive study on neural policy learning for NetHack.
We produce a state-of-the-art neural agent that surpasses previous fully neural policies by 127% in offline settings and 25% in online settings on median game score.
- Score: 37.24009814390211
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural policy learning methods have achieved remarkable results in various
control problems, ranging from Atari games to simulated locomotion. However,
these methods struggle in long-horizon tasks, especially in open-ended
environments with multi-modal observations, such as the popular dungeon-crawler
game, NetHack. Intriguingly, the NeurIPS 2021 NetHack Challenge revealed that
symbolic agents outperformed neural approaches by over four times in median
game score. In this paper, we delve into the reasons behind this performance
gap and present an extensive study on neural policy learning for NetHack. To
conduct this study, we analyze the winning symbolic agent, extending its
codebase to track internal strategy selection in order to generate one of the
largest available demonstration datasets. Utilizing this dataset, we examine
(i) the advantages of an action hierarchy; (ii) enhancements in neural
architecture; and (iii) the integration of reinforcement learning with
imitation learning. Our investigations produce a state-of-the-art neural agent
that surpasses previous fully neural policies by 127% in offline settings and
25% in online settings on median game score. However, we also demonstrate that
mere scaling is insufficient to bridge the performance gap with the best
symbolic models or even the top human players.
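The third ingredient the abstract lists, integrating reinforcement learning with imitation learning, can be illustrated with a toy sketch: a linear softmax policy trained on a behavior-cloning cross-entropy loss plus a REINFORCE term. Every detail here (action count, feature size, weights, data) is invented for illustration and is not the paper's actual architecture or training recipe.

```python
import numpy as np

rng = np.random.default_rng(0)

N_ACTIONS = 4    # hypothetical action count, not NetHack's
N_FEATURES = 8   # hypothetical observation size

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Linear softmax policy: logits = obs @ W
W = np.zeros((N_FEATURES, N_ACTIONS))

def combined_update(W, obs, expert_actions, rewards, taken_actions,
                    lr=0.1, bc_weight=1.0, rl_weight=0.5):
    """One gradient step on BC cross-entropy plus a REINFORCE term."""
    probs = softmax(obs @ W)                      # (batch, actions)
    batch = obs.shape[0]

    # Behavior cloning: gradient of cross-entropy w.r.t. logits is p - onehot
    onehot_exp = np.eye(N_ACTIONS)[expert_actions]
    bc_grad = obs.T @ (probs - onehot_exp) / batch

    # REINFORCE: raise log-prob of taken actions in proportion to reward
    onehot_taken = np.eye(N_ACTIONS)[taken_actions]
    rl_grad = -obs.T @ ((onehot_taken - probs) * rewards[:, None]) / batch

    return W - lr * (bc_weight * bc_grad + rl_weight * rl_grad)

# Toy data: the "expert" always picks action 2, and the environment
# rewards exactly that action; a constant bias feature lets the linear
# policy represent "always act like the expert".
obs = rng.normal(size=(64, N_FEATURES))
obs[:, -1] = 1.0
expert = np.full(64, 2)
taken = rng.integers(0, N_ACTIONS, size=64)
rewards = (taken == 2).astype(float)

for _ in range(200):
    W = combined_update(W, obs, expert, rewards, taken)

print(softmax(obs @ W).mean(axis=0))  # probability mass shifts toward action 2
```

Both gradient terms point the same way on this toy data, so the combined update steadily concentrates probability on the expert's action; in practice the two objectives can conflict, which is part of what the paper studies.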
Related papers
- Scaling Laws for Imitation Learning in Single-Agent Games [29.941613597833133]
We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack.
We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
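A power law of the kind described above can be recovered from (compute, loss) pairs by a linear fit in log-log space, since log L = log a - b log C is linear in log C. The constants below are invented for illustration and are not the paper's fitted values.

```python
import numpy as np

# Synthetic compute budgets (FLOPs) and losses following L = a * C^(-b);
# a_true and b_true are made-up values, not results from the paper.
compute = np.logspace(15, 21, num=13)
a_true, b_true = 50.0, 0.12
loss = a_true * compute ** (-b_true)

# A power law is a straight line in log-log space: log L = log a - b log C
slope, intercept = np.polyfit(np.log(compute), np.log(loss), deg=1)
b_est, a_est = -slope, np.exp(intercept)

print(f"fitted exponent b ~ {b_est:.3f}, prefactor a ~ {a_est:.1f}")
```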
arXiv Detail & Related papers (2023-07-18T16:43:03Z)
- LuckyMera: a Modular AI Framework for Building Hybrid NetHack Agents [7.23273667916516]
Roguelike video games offer a good trade-off between environment complexity and computational cost.
We present LuckyMera, a flexible and modular AI framework built around NetHack.
LuckyMera comes with a set of off-the-shelf symbolic and neural modules (called "skills"): these modules can be either hard-coded behaviors, or neural Reinforcement Learning approaches.
arXiv Detail & Related papers (2023-07-17T14:46:59Z)
- SpawnNet: Learning Generalizable Visuomotor Skills from Pre-trained Networks [52.766795949716986]
We present a study of the generalization capabilities of the pre-trained visual representations at the categorical level.
We propose SpawnNet, a novel two-stream architecture that learns to fuse pre-trained multi-layer representations into a separate network to learn a robust policy.
arXiv Detail & Related papers (2023-07-07T13:01:29Z)
- Backdoor Attack Detection in Computer Vision by Applying Matrix Factorization on the Weights of Deep Networks [6.44397009982949]
We introduce a novel method for backdoor detection that extracts features from pre-trained DNN's weights.
In comparison to other detection techniques, this has a number of benefits, such as not requiring any training data.
Our method outperforms the competing algorithms in terms of efficiency and is more accurate, helping to ensure the safe application of deep learning and AI.
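One generic way to extract features from a pre-trained network's weights, as described above, is to factorize each layer's weight matrix and keep its leading singular values. This sketch uses plain SVD as a stand-in and makes no claim about the paper's exact factorization or detection pipeline.

```python
import numpy as np

def weight_spectrum_features(weight_matrices, k=5):
    """Concatenate the top-k singular values of each layer's weight matrix.

    A generic stand-in for "features extracted from pre-trained DNN
    weights"; the paper's actual matrix factorization may differ.
    """
    feats = []
    for Wm in weight_matrices:
        s = np.linalg.svd(Wm, compute_uv=False)  # singular values, descending
        feats.extend(s[:k])
    return np.asarray(feats)

# Hypothetical two-layer network's weight matrices (random, for shape only)
rng = np.random.default_rng(1)
layers = [rng.normal(size=(64, 32)), rng.normal(size=(32, 10))]
features = weight_spectrum_features(layers, k=5)
print(features.shape)  # (10,): one fixed-size vector per network
```

A fixed-size vector like this can then be fed to any off-the-shelf classifier to separate clean from backdoored models, which is what makes weight-based detection independent of training data.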
arXiv Detail & Related papers (2022-12-15T20:20:18Z)
- Insights From the NeurIPS 2021 NetHack Challenge [40.52602443114554]
The first NeurIPS 2021 NetHack Challenge showcased community-driven progress in AI.
It served as a direct comparison between neural (e.g., deep RL) and symbolic AI, as well as hybrid systems.
No agent got close to winning the game, illustrating NetHack's suitability as a long-term benchmark for AI research.
arXiv Detail & Related papers (2022-03-22T17:01:07Z)
- Gone Fishing: Neural Active Learning with Fisher Embeddings [55.08537975896764]
There is an increasing need for active learning algorithms that are compatible with deep neural networks.
This article introduces BAIT, a practical, tractable, and high-performing active learning algorithm for neural networks.
arXiv Detail & Related papers (2021-06-17T17:26:31Z)
- Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z)
- DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games [137.86426963572214]
We introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL).
Our aim is to understand whether recent advances in DRL can be used to develop convincing behavioral models for non-player characters in videogames.
arXiv Detail & Related papers (2020-12-03T13:53:29Z)
- The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research.
We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL.
We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)
- FRESH: Interactive Reward Shaping in High-Dimensional State Spaces using Human Feedback [9.548547582558662]
Reinforcement learning has been successful in training autonomous agents to accomplish goals in complex environments.
Human players often find it easier to obtain higher rewards in some environments than reinforcement learning algorithms.
This is especially true of high-dimensional state spaces where the reward obtained by the agent is sparse or extremely delayed.
arXiv Detail & Related papers (2020-01-19T06:07:20Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it provides and is not responsible for any consequences of its use.