MarioGPT: Open-Ended Text2Level Generation through Large Language Models
- URL: http://arxiv.org/abs/2302.05981v3
- Date: Wed, 8 Nov 2023 09:43:51 GMT
- Title: MarioGPT: Open-Ended Text2Level Generation through Large Language Models
- Authors: Shyam Sudhakaran, Miguel González-Duque, Claire Glanois, Matthias
Freiberger, Elias Najarro, Sebastian Risi
- Abstract summary: Procedural Content Generation (PCG) is a technique to generate complex and diverse environments in an automated way.
Here, we introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game levels.
- Score: 20.264940262622282
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Procedural Content Generation (PCG) is a technique to generate complex and
diverse environments in an automated way. However, while generating content
with PCG methods is often straightforward, generating meaningful content that
reflects specific intentions and constraints remains challenging. Furthermore,
many PCG algorithms lack the ability to generate content in an open-ended
manner. Recently, Large Language Models (LLMs) have proven incredibly
effective in many diverse domains. These trained LLMs can be fine-tuned,
re-using information and accelerating training for new tasks. Here, we
introduce MarioGPT, a fine-tuned GPT2 model trained to generate tile-based game
levels, in our case Super Mario Bros levels. MarioGPT can not only generate
diverse levels, but can be text-prompted for controllable level generation,
addressing one of the key challenges of current PCG techniques. To our
knowledge, MarioGPT is the first text-to-level model, and combined with novelty
search it enables the generation of diverse levels with varying play-style
dynamics (i.e. player paths) and the open-ended discovery of an increasingly
diverse range of content. Code available at
https://github.com/shyamsn97/mario-gpt.
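The tile-based representation behind such models can be sketched in a few lines. The tile characters and the column-major serialization below are illustrative assumptions, not MarioGPT's exact encoding:

```python
# Sketch of a tile-based level representation for an autoregressive LM.
# Tile characters ('-' air, 'X' ground, '?' question block) and the
# column-major ordering are illustrative assumptions.

def level_to_tokens(level_rows):
    """Flatten a level (equal-length row strings) column by column, so an
    LM can generate the level one vertical slice at a time."""
    height, width = len(level_rows), len(level_rows[0])
    return "".join(
        level_rows[y][x] for x in range(width) for y in range(height)
    )

def tokens_to_level(tokens, height):
    """Invert level_to_tokens: rebuild the row strings from the flat string."""
    width = len(tokens) // height
    return [
        "".join(tokens[x * height + y] for x in range(width))
        for y in range(height)
    ]

level = [
    "----",
    "-?--",
    "XXXX",
]
flat = level_to_tokens(level)          # "--X-?X--X--X"
assert tokens_to_level(flat, height=3) == level
```

A text prompt (e.g. a description of pipe or enemy density) would then be prepended or attended to in order to condition the generated tile sequence on the description.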
Related papers
- CodeGRAG: Bridging the Gap between Natural Language and Programming Language via Graphical Retrieval Augmented Generation [58.84212778960507]
We propose CodeGRAG, a Graphical Retrieval Augmented Code Generation framework to enhance the performance of LLMs.
CodeGRAG builds a graphical view of code blocks from their control flow and data flow to bridge the gap between programming languages and natural language.
Various experiments and ablations on four datasets, covering both C++ and Python, validate the hard meta-graph prompt, the soft prompting technique, and the effectiveness of the objectives for the pretrained GNN expert.
arXiv Detail & Related papers (2024-05-03T02:48:55Z)
- Game Generation via Large Language Models [3.4051285393187327]
This paper investigates game generation via large language models (LLMs).
Based on video game description language, this paper proposes an LLM-based framework to generate game rules and levels simultaneously.
arXiv Detail & Related papers (2024-04-11T10:06:05Z)
- Kosmos-G: Generating Images in Context with Multimodal Large Language Models [117.0259361818715]
Current subject-driven image generation methods require test-time tuning and cannot accept interleaved multi-image and text input.
This paper presents Kosmos-G, a model that leverages the advanced multimodal perception capabilities of Multimodal Large Language Models.
Kosmos-G demonstrates an impressive capability of zero-shot subject-driven generation with interleaved multi-image and text input.
arXiv Detail & Related papers (2023-10-04T17:28:44Z)
- ZeroGen: Zero-shot Multimodal Controllable Text Generation with Multiple Oracles [29.460712493470453]
We propose a new paradigm of zero-shot controllable text generation with multimodal signals (ZeroGen).
ZeroGen leverages text and image controls successively, from the token level to the sentence level, and maps them into a unified probability space at decoding.
We show that ZeroGen not only outperforms its counterparts on captioning tasks by a large margin but also shows great potential in multimodal news generation with a higher degree of control.
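Decoding-time fusion of several control signals into one probability space can be sketched roughly as follows; the log-linear weighting and the oracle score format are illustrative assumptions, not ZeroGen's actual formulation:

```python
import math

def fuse_controls(lm_logprobs, oracle_scores, weights):
    """Combine LM next-token log-probabilities with per-oracle control
    scores log-linearly, then renormalize into a single distribution."""
    fused = {}
    for tok, lp in lm_logprobs.items():
        s = lp
        for oracle, w in zip(oracle_scores, weights):
            s += w * oracle.get(tok, 0.0)
        fused[tok] = s
    m = max(fused.values())                      # numerical stability
    exp = {t: math.exp(v - m) for t, v in fused.items()}
    z = sum(exp.values())
    return {t: e / z for t, e in exp.items()}

lm = {"cat": math.log(0.5), "dog": math.log(0.5)}
image_oracle = {"cat": 1.0, "dog": -1.0}         # image depicts a cat
out = fuse_controls(lm, [image_oracle], [1.0])
assert max(out, key=out.get) == "cat"
```

Adding more oracles (e.g. a text-keyword scorer alongside the image scorer) is just another entry in `oracle_scores` and `weights`, which is the appeal of fusing everything in one probability space.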
arXiv Detail & Related papers (2023-06-29T03:22:43Z)
- MGDoc: Pre-training with Multi-granular Hierarchy for Document Image Understanding [53.03978356918377]
Spatial hierarchical relationships between content at different levels of granularity are crucial for document image understanding tasks.
Existing methods learn features at either the word level or the region level but fail to consider both simultaneously.
We propose MGDoc, a new multi-modal multi-granular pre-training framework that encodes page-level, region-level, and word-level information at the same time.
arXiv Detail & Related papers (2022-11-27T22:47:37Z)
- Learning to Transfer Prompts for Text Generation [97.64625999380425]
We propose a novel prompt-based method (PTG) for text generation in a transferable setting.
First, PTG learns a set of source prompts for various source generation tasks and then transfers these prompts as target prompts to perform target generation tasks.
In extensive experiments, PTG yields competitive or better results than fine-tuning methods.
arXiv Detail & Related papers (2022-05-03T14:53:48Z)
- TegTok: Augmenting Text Generation via Task-specific and Open-world Knowledge [83.55215993730326]
We propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
Our model selects knowledge entries from two types of knowledge sources through dense retrieval and then injects them into the input encoding and output decoding stages respectively.
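The dense-retrieval step can be sketched directly; the embedding vectors and entry format here are toy assumptions standing in for real encoder outputs:

```python
def dense_retrieve(query_vec, entries, top_k=2):
    """Score knowledge entries by inner product with the query embedding
    and return the top-k entry texts (toy stand-in for a dense retriever)."""
    def score(entry):
        return sum(q * v for q, v in zip(query_vec, entry["vec"]))
    ranked = sorted(entries, key=score, reverse=True)
    return [e["text"] for e in ranked[:top_k]]

knowledge = [
    {"text": "task-specific note", "vec": [0.9, 0.1]},
    {"text": "open-world fact",    "vec": [0.2, 0.8]},
    {"text": "unrelated entry",    "vec": [-0.5, -0.5]},
]
hits = dense_retrieve([1.0, 0.0], knowledge, top_k=1)
assert hits == ["task-specific note"]
```

Per the abstract, the retrieved entries would then be injected into the input encoding and output decoding stages respectively.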
arXiv Detail & Related papers (2022-03-16T10:37:59Z)
- Experience-Driven PCG via Reinforcement Learning: A Super Mario Bros Study [2.2215852332444905]
The framework is initially tested in the Super Mario Bros game.
The correctness of the generation is ensured by a neural net-assisted evolutionary level repairer.
Our proposed framework is capable of generating endless, playable Super Mario Bros levels.
arXiv Detail & Related papers (2021-06-30T08:10:45Z)
- GeDi: Generative Discriminator Guided Sequence Generation [53.15651536569169]
We propose GeDi as an efficient method for using smaller LMs as generative discriminators to guide generation from large LMs.
We find that GeDi gives stronger controllability than the state-of-the-art method while also achieving generation speeds more than 30 times faster.
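The discriminator-guided reweighting can be sketched as follows; the uniform class prior, the single-step posterior, and the exponent `omega` are simplifications of GeDi's actual Bayes-rule formulation:

```python
def gedi_reweight(base_probs, pos_probs, neg_probs, omega=2.0):
    """Reweight base-LM next-token probabilities by the posterior that a
    token continues the desired class, computed from two small
    class-conditional distributions under a uniform class prior."""
    scores = {}
    for tok, p in base_probs.items():
        p_pos = pos_probs.get(tok, 1e-9)
        p_neg = neg_probs.get(tok, 1e-9)
        posterior = p_pos / (p_pos + p_neg)   # P(class=pos | tok)
        scores[tok] = p * posterior ** omega
    z = sum(scores.values())
    return {tok: s / z for tok, s in scores.items()}

base = {"kind": 0.5, "cruel": 0.5}
positive = {"kind": 0.9, "cruel": 0.1}   # class-conditional LM, "positive"
negative = {"kind": 0.1, "cruel": 0.9}   # class-conditional LM, "negative"
out = gedi_reweight(base, positive, negative)
assert max(out, key=out.get) == "kind"
```

The efficiency claim in the abstract comes from the guiding class-conditional LM being much smaller than the base LM, so the extra per-token cost is modest.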
arXiv Detail & Related papers (2020-09-14T17:45:36Z)
- TOAD-GAN: Coherent Style Level Generation from a Single Example [24.039037782220017]
We present TOAD-GAN, a novel Procedural Content Generation (PCG) algorithm that generates token-based video game levels.
We demonstrate its application for Super Mario Bros. levels and are able to generate new levels of similar style in arbitrary sizes.
arXiv Detail & Related papers (2020-08-04T13:44:50Z)
- Capturing Local and Global Patterns in Procedural Content Generation via Machine Learning [9.697217570243845]
Recent procedural content generation via machine learning (PCGML) methods learn to produce new content from existing content.
It remains an open question how well these approaches capture large-scale visual patterns such as symmetry.
In this paper, we propose match-three games as a domain for testing PCGML algorithms on their ability to generate suitable patterns.
arXiv Detail & Related papers (2020-05-26T08:58:37Z)
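A large-scale pattern check of the kind such evaluations need can be sketched directly; the row-string grid format and the mirror-match score are illustrative assumptions:

```python
def horizontal_symmetry(grid):
    """Fraction of cells that match their horizontal mirror image:
    1.0 for a perfectly left-right symmetric pattern."""
    matches = total = 0
    for row in grid:
        for x in range(len(row)):
            matches += row[x] == row[len(row) - 1 - x]
            total += 1
    return matches / total

symmetric = ["ABA", "CDC"]
skewed    = ["AAB", "CCC"]
assert horizontal_symmetry(symmetric) == 1.0
assert horizontal_symmetry(skewed) < 1.0
```

Scoring generated grids against such global metrics, rather than only tile-level statistics, is one way to quantify whether a PCGML model has captured patterns like symmetry.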
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.