Adversarial Reinforcement Learning for Procedural Content Generation
- URL: http://arxiv.org/abs/2103.04847v1
- Date: Mon, 8 Mar 2021 15:51:42 GMT
- Title: Adversarial Reinforcement Learning for Procedural Content Generation
- Authors: Linus Gisslén, Andy Eakins, Camilo Gordillo, Joakim Bergdahl, Konrad Tollmar
- Abstract summary: We present an approach for procedural content generation (PCG) and for improving generalization in reinforcement learning (RL) agents.
One popular approach is to procedurally generate different environments to increase the generalizability of the trained agents.
Here we deploy an adversarial model with one PCG RL agent and one solving RL agent.
- Score: 0.3779860024918729
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present an approach for procedural content generation (PCG), and
for improving generalization in reinforcement learning (RL) agents, using
adversarial deep RL. Training RL agents for generalization over novel environments is a
notoriously difficult task. One popular approach is to procedurally generate
different environments to increase the generalizability of the trained agents.
Here we deploy an adversarial model with one PCG RL agent (called Generator),
and one solving RL agent (called Solver). The benefit is mainly two-fold:
Firstly, the Solver achieves better generalization through the generated
challenges from the Generator. Secondly, the trained Generator can be used as a
creator of novel environments that, with the help of the Solver, can be
verified to be solvable. The Generator receives a reward signal based on the performance of
the Solver which encourages the environment design to be challenging but not
impossible. To further drive diversity and control of the environment
generation, we propose the use of auxiliary inputs for the Generator. Thus, we
propose adversarial RL for procedural content generation (ARLPCG), an
adversarial approach which procedurally generates previously unseen
environments with an auxiliary input as a control variable. Herein we describe
this concept in detail and compare it with previous methods, showing improved
generalization as well as a new way to create novel environments.
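Read as a loop, the abstract describes alternating updates: the Generator emits a level conditioned on an auxiliary input, the Solver attempts it, and each agent is rewarded accordingly. The minimal Python sketch below illustrates that structure; the stub agents, toy level encoding, and exact reward shaping are illustrative assumptions, not the paper's implementation.

```python
import random

class StubAgent:
    """Stand-in for an RL agent; a real system would use e.g. PPO policies."""
    def update(self, reward: float) -> None:
        pass  # a gradient step on the agent's policy would go here

class StubGenerator(StubAgent):
    def act(self, aux: float) -> list:
        # Emit a toy "environment": ten platform gaps whose width grows as
        # the auxiliary input asks for harder content (aux=+1 easy, -1 hard).
        difficulty = (1.0 - aux) / 2.0
        return [random.uniform(0.0, difficulty) for _ in range(10)]

class StubSolver(StubAgent):
    def rollout(self, env: list) -> tuple:
        # Toy dynamics: each gap is cleared with probability 1 - gap width.
        cleared = sum(random.random() > gap for gap in env)
        return cleared / len(env), cleared == len(env)

def train_arlpcg(generator: StubGenerator, solver: StubSolver,
                 episodes: int = 1000) -> None:
    for _ in range(episodes):
        aux = random.choice([-1.0, 1.0])      # auxiliary control input
        env = generator.act(aux)              # Generator proposes a level
        solver_return, solved = solver.rollout(env)
        solver.update(solver_return)          # Solver learns from the level
        # Generator reward: with aux < 0 it is paid for low Solver returns
        # (hard levels), with aux > 0 for high returns (easy levels); the
        # final term keeps generated levels solvable at all.
        generator.update(aux * solver_return + (1.0 if solved else -1.0))

train_arlpcg(StubGenerator(), StubSolver())
```

In a real system the stubs would be deep RL policies acting in a game environment; the design point carried over from the abstract is that the auxiliary input flips the sign of the Generator's difficulty incentive, turning a single trained Generator into a tunable level designer.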
Related papers
- Adversarial Environment Design via Regret-Guided Diffusion Models [13.651184780336623]
Training agents that are robust to environmental changes remains a significant challenge in deep reinforcement learning (RL).
Unsupervised environment design (UED) has recently emerged to address this issue by generating a set of training environments tailored to the agent's capabilities.
We propose a novel UED algorithm, adversarial environment design via regret-guided diffusion models (ADD).
arXiv Detail & Related papers (2024-10-25T17:35:03Z)
- DRED: Zero-Shot Transfer in Reinforcement Learning via Data-Regularised Environment Design [11.922951794283168]
In this work, we investigate how the sampling of individual environment instances, or levels, affects the zero-shot generalisation (ZSG) ability of RL agents.
We discover that for deep actor-critic architectures sharing their base layers, prioritising levels according to their value loss minimises the mutual information between the agent's internal representation and the set of training levels in the generated training data.
We find that existing UED methods can significantly shift the training distribution, which translates to low ZSG performance.
To prevent both overfitting and distributional shift, we introduce data-regularised environment design (DRED).
arXiv Detail & Related papers (2024-02-05T19:47:45Z)
- CCA: Collaborative Competitive Agents for Image Editing [59.54347952062684]
This paper presents a novel generative model, Collaborative Competitive Agents (CCA).
It leverages multiple Large Language Model (LLM)-based agents to execute complex tasks.
The paper's main contributions include the introduction of a multi-agent-based generative model with controllable intermediate steps and iterative optimization.
arXiv Detail & Related papers (2024-01-23T11:46:28Z)
- Controlling the Latent Space of GANs through Reinforcement Learning: A Case Study on Task-based Image-to-Image Translation [5.881800919492065]
Generative Adversarial Networks (GANs) have emerged as a formidable AI tool to generate realistic outputs based on training datasets.
We propose a novel methodology that integrates a reinforcement learning (RL) agent with a latent-space GAN (l-GAN).
We have developed an actor-critic RL agent with a meticulously designed reward policy, enabling it to acquire proficiency in navigating the latent space of the l-GAN.
arXiv Detail & Related papers (2023-07-26T06:34:24Z)
- Mastering the Unsupervised Reinforcement Learning Benchmark from Pixels [112.63440666617494]
Reinforcement learning algorithms can succeed but require large amounts of interaction between the agent and the environment.
We propose a new method to solve this benchmark, using unsupervised model-based RL to pre-train the agent.
We show robust performance on the Real-World RL benchmark, hinting at resiliency to environment perturbations during adaptation.
arXiv Detail & Related papers (2022-09-24T14:22:29Z)
- Deep Surrogate Assisted Generation of Environments [7.217405582720078]
Quality diversity (QD) optimization has been proven to be an effective component of environment generation algorithms.
We propose Deep Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD environment generation algorithm.
Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms.
arXiv Detail & Related papers (2022-06-09T00:14:03Z)
- A Game-Theoretic Approach for Improving Generalization Ability of TSP Solvers [16.98434288039677]
We introduce a two-player zero-sum framework between a trainable Solver and a Data Generator.
We show that our framework achieves the most generalizable performance on different Traveling Salesman Problem (TSP) tasks.
arXiv Detail & Related papers (2021-10-28T13:35:22Z)
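A plausible formalization of the zero-sum objective summarized in the entry above, in our own notation (not taken from the paper): with Solver S_theta, Data Generator G_phi, and tour cost c(.),

```latex
\min_{\theta} \max_{\phi} \; \mathbb{E}_{x \sim G_{\phi}} \left[ c\left( S_{\theta}(x) \right) \right]
```

That is, the Solver minimizes expected tour cost on the instances the Generator proposes, while the Generator seeks instances the current Solver handles worst, mirroring the Generator/Solver game of the main paper above.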
- Controllable and Compositional Generation with Latent-Space Energy-Based Models [60.87740144816278]
Controllable generation is one of the key requirements for successful adoption of deep generative models in real-world applications.
In this work, we use energy-based models (EBMs) to handle compositional generation over a set of attributes.
By composing energy functions with logical operators, this work is the first to achieve such compositionality in generating photo-realistic images of resolution 1024x1024.
arXiv Detail & Related papers (2021-10-21T03:31:45Z)
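The logical composition mentioned in the entry above is, in the standard EBM reading (our sketch, not the paper's exact formulas), arithmetic on energies: conjunction adds them, disjunction takes a soft minimum, negation flips the sign. A toy Python illustration:

```python
import math

# Toy per-attribute energies for one candidate sample; in practice these
# would come from learned attribute classifiers or energy networks.
def e_and(e1: float, e2: float) -> float:
    """Conjunction: the sample must satisfy both attributes (sum of energies)."""
    return e1 + e2

def e_or(e1: float, e2: float) -> float:
    """Disjunction: low energy if either attribute holds (soft minimum)."""
    return -math.log(math.exp(-e1) + math.exp(-e2))

def e_not(e: float, alpha: float = 1.0) -> float:
    """Negation: invert the attribute's energy (lower energy = more likely)."""
    return -alpha * e

# Energy of "young AND (blond OR NOT smiling)" for one sample's toy energies.
young, blond, smiling = 0.3, 2.0, 0.1
print(e_and(young, e_or(blond, e_not(smiling))))
```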
- Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED).
Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
arXiv Detail & Related papers (2020-12-03T17:37:01Z)
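The regret signal behind the PAIRED entry above can be sketched concretely: the environment adversary is rewarded by the gap between an antagonist agent's return and the protagonist's return on the same generated level. The rollout below is a toy stand-in for real RL training, not the paper's implementation.

```python
import random

def rollout_return(agent_skill: float, level_difficulty: float) -> float:
    """Toy episodic return: skilled agents retain more return on hard levels."""
    return max(0.0, agent_skill - level_difficulty + random.gauss(0.0, 0.05))

def paired_regret(protagonist_skill: float, antagonist_skill: float,
                  level_difficulty: float) -> float:
    # Regret peaks on levels the antagonist solves but the protagonist
    # cannot, i.e. challenging yet feasible levels.
    return (rollout_return(antagonist_skill, level_difficulty)
            - rollout_return(protagonist_skill, level_difficulty))

# The adversary proposes the level parameters maximizing this regret,
# which yields the curriculum of increasingly complex environments.
candidates = [0.2, 0.5, 0.8]
best = max(candidates, key=lambda d: paired_regret(0.5, 0.9, d))
print(f"adversary proposes difficulty {best}")
```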
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Unsupervised Controllable Generation with Self-Training [90.04287577605723]
Controllable generation with GANs remains a challenging research problem.
We propose an unsupervised framework to learn a distribution of latent codes that control the generator through self-training.
Our framework exhibits better disentanglement compared to other variants such as the variational autoencoder.
arXiv Detail & Related papers (2020-07-17T21:50:35Z)
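One generic reading of the self-training loop in the entry above (our assumptions; the paper's actual components differ in detail): repeatedly sample latent codes from the current distribution, pseudo-label the generated outputs, and refit the code distribution on the codes whose outputs were confidently labeled.

```python
import random

def generator(code: float) -> float:
    """Stand-in for a GAN generator mapping a latent code to a sample."""
    return 2.0 * code + random.gauss(0.0, 0.1)

def pseudo_label(sample: float) -> float:
    """Toy pseudo-labeler returning a confidence for the generated sample."""
    return min(1.0, abs(sample))

mean, std = 0.0, 1.0  # current latent-code distribution
for _ in range(100):
    codes = [random.gauss(mean, std) for _ in range(64)]
    # Self-training step: keep codes whose outputs are confidently labeled...
    kept = [c for c in codes if pseudo_label(generator(c)) > 0.5]
    if kept:
        # ...and refit the code distribution to those codes.
        mean = sum(kept) / len(kept)
        var = sum((c - mean) ** 2 for c in kept) / len(kept)
        std = max(0.1, var ** 0.5)
print(f"learned code distribution: mean={mean:.2f}, std={std:.2f}")
```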
This list is automatically generated from the titles and abstracts of the papers on this site.