MarioGPT: Open-Ended Text2Level Generation through Large Language Models
- URL: http://arxiv.org/abs/2302.05981v3
- Date: Wed, 8 Nov 2023 09:43:51 GMT
- Title: MarioGPT: Open-Ended Text2Level Generation through Large Language Models
- Authors: Shyam Sudhakaran, Miguel González-Duque, Claire Glanois, Matthias
Freiberger, Elias Najarro, Sebastian Risi
- Abstract summary: Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way.
Here, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels.
- Score: 20.264940262622282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Procedural Content Generation (PCG) is a technique to generate complex and
diverse environments in an automated way. However, while generating content
with PCG methods is often straightforward, generating meaningful content that
reflects specific intentions and constraints remains challenging. Furthermore,
many PCG algorithms lack the ability to generate content in an open-ended
manner. Recently, Large Language Models (LLMs) have proven incredibly
effective in many diverse domains. These trained LLMs can be fine-tuned,
re-using information and accelerating training for new tasks. Here, we
introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game
levels, in our case Super Mario Bros levels. MarioGPT can not only generate
diverse levels, but can be text-prompted for controllable level generation,
addressing one of the key challenges of current PCG techniques. To our
knowledge, MarioGPT is the first text-to-level model, and combined with novelty
search it enables the generation of diverse levels with varying play-style
dynamics (i.e. player paths) and the open-ended discovery of an increasingly
diverse range of content. Code available at
https://github.com/shyamsn97/mario-gpt.
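The tile-based representation behind such models can be sketched in a few lines. The tile characters and the column-major serialization below are illustrative assumptions, not MarioGPT's exact encoding:

```python
# Sketch of a tile-based level representation for an autoregressive LM.
# Tile characters ('-' air, 'X' ground, '?' question block) and the
# column-major ordering are illustrative assumptions.

def level_to_tokens(level_rows):
    """Flatten a level (equal-length row strings) column by column, so an
    LM can generate the level one vertical slice at a time."""
    height, width = len(level_rows), len(level_rows[0])
    return "".join(
        level_rows[y][x] for x in range(width) for y in range(height)
    )

def tokens_to_level(tokens, height):
    """Invert level_to_tokens: rebuild the row strings from the flat string."""
    width = len(tokens) // height
    return [
        "".join(tokens[x * height + y] for x in range(width))
        for y in range(height)
    ]

level = [
    "----",
    "-?--",
    "XXXX",
]
flat = level_to_tokens(level)          # "--X-?X--X--X"
assert tokens_to_level(flat, height=3) == level
```

A text prompt (e.g. a description of pipe or enemy density) would then be prepended or attended to in order to condition the generated tile sequence on the description.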
Related papers
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Various experiments and ablations on four datasets, covering both C++ and Python, validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Game Generation via Large Language Models [3.4051285393187327]
This paper investigates game generation via large language models (LLMs).
Based on video game description language, this paper proposes an LLM-based framework to generate game rules and levels simultaneously.
arXiv Detail & Related papers (2024-04-11T10:06:05Z)
- Kosmos-G: Generating Images in Context with Multimodal Large Language Models [117.0259361818715]
Current subject-driven image generation methods require test-time tuning and cannot accept interleaved multi-image and text input.
This paper presents Kosmos-G, a model that leverages the advanced multimodal perception capabilities of Multimodal Large Language Models.
Kosmos-G demonstrates an impressive capability of zero-shot subject-driven generation with interleaved multi-image and text input.
arXiv Detail & Related papers (2023-10-04T17:28:44Z)
- ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles [29.460712493470453]
We propose a new paradigm of zero-shot controllable text generation with multimodal signals (ZeroGen).
ZeroGen leverages text and image controls successively, from the token level to the sentence level, and maps them into a unified probability space at decoding.
We show that ZeroGen not only outperforms its counterparts on captioning tasks by a large margin but also shows great potential in multimodal news generation with a higher degree of control.
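Decoding-time fusion of several control signals into one probability space can be sketched roughly as follows; the log-linear weighting and the oracle score format are illustrative assumptions, not ZeroGen's actual formulation:

```python
import math

def fuse_controls(lm_logprobs, oracle_scores, weights):
    """Combine LM next-token log-probabilities with per-oracle control
    scores log-linearly, then renormalize into a single distribution."""
    fused = {}
    for tok, lp in lm_logprobs.items():
        s = lp
        for oracle, w in zip(oracle_scores, weights):
            s += w * oracle.get(tok, 0.0)
        fused[tok] = s
    m = max(fused.values())                      # numerical stability
    exp = {t: math.exp(v - m) for t, v in fused.items()}
    z = sum(exp.values())
    return {t: e / z for t, e in exp.items()}

lm = {"cat": math.log(0.5), "dog": math.log(0.5)}
image_oracle = {"cat": 1.0, "dog": -1.0}         # image depicts a cat
out = fuse_controls(lm, [image_oracle], [1.0])
assert max(out, key=out.get) == "cat"
```

Adding more oracles (e.g. a text-keyword scorer alongside the image scorer) is just another entry in `oracle_scores` and `weights`, which is the appeal of fusing everything in one probability space.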
arXiv Detail & Related papers (2023-06-29T03:22:43Z)
- MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding [53.03978356918377]
Spatial hierarchical relationships between content at different levels of granularity are crucial for document image understanding tasks.
Existing methods learn features at either the word level or the region level but fail to consider both simultaneously.
We propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time.
arXiv Detail & Related papers (2022-11-27T22:47:37Z)
- Learning to Transfer Prompts for Text Generation [97.64625999380425]
We propose a novel prompt-based method (PTG) for text generation in a transferable setting.
First, PTG learns a set of source prompts for various source generation tasks and then transfers these prompts as target prompts to perform target generation tasks.
In extensive experiments, PTG yields competitive or better results than fine-tuning methods.
arXiv Detail & Related papers (2022-05-03T14:53:48Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively.
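The dense-retrieval step can be sketched directly; the embedding vectors and entry format here are toy assumptions standing in for real encoder outputs:

```python
def dense_retrieve(query_vec, entries, top_k=2):
    """Score knowledge entries by inner product with the query embedding
    and return the top-k entry texts (toy stand-in for a dense retriever)."""
    def score(entry):
        return sum(q * v for q, v in zip(query_vec, entry["vec"]))
    ranked = sorted(entries, key=score, reverse=True)
    return [e["text"] for e in ranked[:top_k]]

knowledge = [
    {"text": "task-specific note", "vec": [0.9, 0.1]},
    {"text": "open-world fact",    "vec": [0.2, 0.8]},
    {"text": "unrelated entry",    "vec": [-0.5, -0.5]},
]
hits = dense_retrieve([1.0, 0.0], knowledge, top_k=1)
assert hits == ["task-specific note"]
```

Per the abstract, the retrieved entries would then be injected into the input encoding and output decoding stages respectively.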
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- Experience-Driven PCG via Reinforcement Learning: A Super Mario Bros Study [2.2215852332444905]
The framework is initially tested in the Super Mario Bros game.
The correctness of the generation is ensured by a neural net-assisted evolutionary level repairer.
Our proposed framework is capable of generating endless, playable Super Mario Bros levels.
arXiv Detail & Related papers (2021-06-30T08:10:45Z)
- GeDi: Generative Discriminator Guided Sequence Generation [53.15651536569169]
We propose GeDi as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs.
We find that GeDi gives stronger controllability than the state-of-the-art method while also achieving generation speeds more than 30 times faster.
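The discriminator-guided reweighting can be sketched as follows; the uniform class prior, the single-step posterior, and the exponent `omega` are simplifications of GeDi's actual Bayes-rule formulation:

```python
def gedi_reweight(base_probs, pos_probs, neg_probs, omega=2.0):
    """Reweight base-LM next-token probabilities by the posterior that a
    token continues the desired class, computed from two small
    class-conditional distributions under a uniform class prior."""
    scores = {}
    for tok, p in base_probs.items():
        p_pos = pos_probs.get(tok, 1e-9)
        p_neg = neg_probs.get(tok, 1e-9)
        posterior = p_pos / (p_pos + p_neg)   # P(class=pos | tok)
        scores[tok] = p * posterior ** omega
    z = sum(scores.values())
    return {tok: s / z for tok, s in scores.items()}

base = {"kind": 0.5, "cruel": 0.5}
positive = {"kind": 0.9, "cruel": 0.1}   # class-conditional LM, "positive"
negative = {"kind": 0.1, "cruel": 0.9}   # class-conditional LM, "negative"
out = gedi_reweight(base, positive, negative)
assert max(out, key=out.get) == "kind"
```

The efficiency claim in the abstract comes from the guiding class-conditional LM being much smaller than the base LM, so the extra per-token cost is modest.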
arXiv Detail & Related papers (2020-09-14T17:45:36Z)
- TOAD-GAN: Coherent Style Level Generation from a Single Example [24.039037782220017]
We present TOAD-GAN, a novel Procedural Content Generation (PCG) algorithm that generates token-based video game levels.
We demonstrate its application for Super Mario Bros. levels and are able to generate new levels of similar style in arbitrary sizes.
arXiv Detail & Related papers (2020-08-04T13:44:50Z)
- Capturing Local and Global Patterns in Procedural Content Generation via Machine Learning [9.697217570243845]
Recent procedural content generation via machine learning (PCGML) methods learn to produce new content from existing content.
It remains an open question how well these approaches capture large-scale visual patterns such as symmetry.
In this paper, we propose match-three games as a domain for testing PCGML algorithms on their ability to generate suitable patterns.
arXiv Detail & Related papers (2020-05-26T08:58:37Z)
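A large-scale pattern check of the kind such evaluations need can be sketched directly; the row-string grid format and the mirror-match score are illustrative assumptions:

```python
def horizontal_symmetry(grid):
    """Fraction of cells that match their horizontal mirror image:
    1.0 for a perfectly left-right symmetric pattern."""
    matches = total = 0
    for row in grid:
        for x in range(len(row)):
            matches += row[x] == row[len(row) - 1 - x]
            total += 1
    return matches / total

symmetric = ["ABA", "CDC"]
skewed    = ["AAB", "CCC"]
assert horizontal_symmetry(symmetric) == 1.0
assert horizontal_symmetry(skewed) < 1.0
```

Scoring generated grids against such global metrics, rather than only tile-level statistics, is one way to quantify whether a PCGML model has captured patterns like symmetry.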
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.