Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents
- URL: http://arxiv.org/abs/2407.04221v2
- Date: Tue, 6 Aug 2024 09:39:14 GMT
- Title: Autoverse: An Evolvable Game Language for Learning Robust Embodied Agents
- Authors: Sam Earle, Julian Togelius
- Abstract summary: We introduce Autoverse, an evolvable, domain-specific language for single-player 2D grid-based games.
We demonstrate its use as a scalable training ground for Open-Ended Learning (OEL) algorithms.
- Score: 2.624282086797512
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce Autoverse, an evolvable, domain-specific language for single-player 2D grid-based games, and demonstrate its use as a scalable training ground for Open-Ended Learning (OEL) algorithms. Autoverse uses cellular-automaton-like rewrite rules to describe game mechanics, allowing it to express various game environments (e.g. mazes, dungeons, sokoban puzzles) that are popular testbeds for Reinforcement Learning (RL) agents. Each rewrite rule can be expressed as a series of simple convolutions, allowing for environments to be parallelized on the GPU, thereby drastically accelerating RL training. Using Autoverse, we propose jump-starting open-ended learning by imitation learning from search. In such an approach, we first evolve Autoverse environments (their rules and initial map topology) to maximize the number of iterations required by greedy tree search to discover a new best solution, producing a curriculum of increasingly complex environments and playtraces. We then distill these expert playtraces into a neural-network-based policy using imitation learning. Finally, we use the learned policy as a starting point for open-ended RL, where new training environments are continually evolved to maximize the RL player agent's value function error (a proxy for its regret, or the learnability of generated environments), finding that this approach improves the performance and generality of resultant player agents.
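To illustrate the abstract's claim that each rewrite rule can be expressed as a series of simple convolutions, here is a minimal, hypothetical sketch; it is not the authors' implementation, and the function name `apply_rewrite_rule`, the -1 wildcard convention, and the toy tile values are illustrative assumptions. Each tile type becomes a one-hot channel, the rule's left-hand side becomes a small kernel per channel, and a cell matches when every channel's correlation count equals that channel's kernel sum; matched cells are then rewritten.
```python
# Minimal, hypothetical sketch of an Autoverse-style rewrite rule evaluated as
# a correlation over one-hot tile channels (not the paper's code; names,
# the wildcard convention, and tile values are assumptions).
import numpy as np
from scipy.signal import correlate2d

def apply_rewrite_rule(grid, pattern, out_value):
    """Rewrite the centre of every exact match of `pattern` to `out_value`.

    grid      : 2D int array of tile indices.
    pattern   : small 2D int array (the rule's left-hand side) with odd
                dimensions; entries of -1 act as wildcards.
    out_value : tile index written at the centre of each match.
    """
    match = np.ones(grid.shape, dtype=bool)
    for tile in np.unique(pattern):
        if tile < 0:                      # wildcards impose no constraint
            continue
        kernel = (pattern == tile).astype(float)
        # Correlating the one-hot channel for `tile` with this kernel counts,
        # for each candidate centre cell, how many of the rule's `tile` cells
        # line up with the grid; an exact match hits all of them.
        hits = correlate2d((grid == tile).astype(float), kernel, mode="same")
        match &= np.isclose(hits, kernel.sum())
    out = grid.copy()
    out[match] = out_value
    return out

# Toy usage: a floor cell (0) flanked horizontally by walls (1) becomes a door (2).
grid = np.array([[1, 0, 1],
                 [1, 0, 1],
                 [0, 0, 0]])
pattern = np.array([[1, 0, 1]])           # wall, floor, wall
print(apply_rewrite_rule(grid, pattern, out_value=2))
# [[1 2 1]
#  [1 2 1]
#  [0 0 0]]
```
Because each rule reduces to a fixed set of correlations over one-hot channels, many rules and many environments could be evaluated as batched convolutions on an accelerator, which is the mechanism behind the GPU speedup the abstract describes; the NumPy version above is kept deliberately small and self-contained.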
Related papers
- Accelerate Multi-Agent Reinforcement Learning in Zero-Sum Games with Subgame Curriculum Learning [65.36326734799587]
We present a novel subgame curriculum learning framework for zero-sum games.
It adopts an adaptive initial state distribution by resetting agents to some previously visited states.
We derive a subgame selection metric that approximates the squared distance to NE values.
arXiv Detail & Related papers (2023-10-07T13:09:37Z)
- Mechanic Maker 2.0: Reinforcement Learning for Evaluating Generated Rules [5.9135869246353305]
We investigate the application of Reinforcement Learning as an approximator for human play for rule generation.
We recreate the classic AGD environment Mechanic Maker in Unity as a new, open-source rule generation framework.
arXiv Detail & Related papers (2023-09-18T04:15:09Z)
- Scaling Laws for Imitation Learning in Single-Agent Games [29.941613597833133]
We investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.
We first demonstrate our findings on a variety of Atari games, and thereafter focus on the extremely challenging game of NetHack.
We find that IL loss and mean return scale smoothly with the compute budget and are strongly correlated, resulting in power laws for training compute-optimal IL agents.
arXiv Detail & Related papers (2023-07-18T16:43:03Z)
- SPRING: Studying the Paper and Reasoning to Play Games [102.5587155284795]
We propose SPRING, a novel approach that reads the game's original academic paper and uses the knowledge learned to reason about and play the game through a large language model (LLM).
In experiments, we study the quality of in-context "reasoning" induced by different forms of prompts under the setting of the Crafter open-world environment.
Our experiments suggest that LLMs, when prompted with consistent chain-of-thought, have great potential in completing sophisticated high-level trajectories.
arXiv Detail & Related papers (2023-05-24T18:14:35Z)
- Learning to Play Text-based Adventure Games with Maximum Entropy Reinforcement Learning [4.698846136465861]
We adapt the soft actor-critic (SAC) algorithm to the text-based environment.
We show that the reward shaping technique helps the agent to learn the policy faster and achieve higher scores.
arXiv Detail & Related papers (2023-02-21T15:16:12Z)
- Discovering Multi-Agent Auto-Curricula in Two-Player Zero-Sum Games [31.97631243571394]
We introduce a framework, LMAC, that automates the discovery of the update rule without explicit human design.
Surprisingly, even without human design, the discovered MARL algorithms achieve competitive or even better performance.
We show that LMAC is able to generalise from small games to large games, for example training on Kuhn Poker and outperforming PSRO.
arXiv Detail & Related papers (2021-06-04T22:30:25Z)
- Deep Policy Networks for NPC Behaviors that Adapt to Changing Design Parameters in Roguelike Games [137.86426963572214]
Turn-based strategy games such as Roguelikes present unique challenges to Deep Reinforcement Learning (DRL).
We propose two network architectures to better handle complex categorical state spaces and to mitigate the need for retraining forced by design decisions.
arXiv Detail & Related papers (2020-12-07T08:47:25Z)
- DeepCrawl: Deep Reinforcement Learning for Turn-based Strategy Games [137.86426963572214]
We introduce DeepCrawl, a fully-playable Roguelike prototype for iOS and Android in which all agents are controlled by policy networks trained using Deep Reinforcement Learning (DRL).
Our aim is to understand whether recent advances in DRL can be used to develop convincing behavioral models for non-player characters in videogames.
arXiv Detail & Related papers (2020-12-03T13:53:29Z)
- Discovering Reinforcement Learning Algorithms [53.72358280495428]
Reinforcement learning algorithms update an agent's parameters according to one of several possible rules.
This paper introduces a new meta-learning approach that discovers an entire update rule.
It includes both 'what to predict' (e.g. value functions) and 'how to learn from it' by interacting with a set of environments.
arXiv Detail & Related papers (2020-07-17T07:38:39Z)
- The NetHack Learning Environment [79.06395964379107]
We present the NetHack Learning Environment (NLE), a procedurally generated rogue-like environment for Reinforcement Learning research.
We argue that NetHack is sufficiently complex to drive long-term research on problems such as exploration, planning, skill acquisition, and language-conditioned RL.
We demonstrate empirical success for early stages of the game using a distributed Deep RL baseline and Random Network Distillation exploration.
arXiv Detail & Related papers (2020-06-24T14:12:56Z)