Deep Surrogate Assisted Generation of Environments
- URL: http://arxiv.org/abs/2206.04199v1
- Date: Thu, 9 Jun 2022 00:14:03 GMT
- Title: Deep Surrogate Assisted Generation of Environments
- Authors: Varun Bhatt, Bryon Tjanaka, Matthew C. Fontaine, Stefanos Nikolaidis
- Abstract summary: Quality diversity (QD) optimization has been proven to be an effective component of environment generation algorithms.
We propose Deep Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD environment generation algorithm.
Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms.
- Score: 7.217405582720078
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Recent progress in reinforcement learning (RL) has started producing
generally capable agents that can solve a distribution of complex environments.
These agents are typically tested on fixed, human-authored environments. On the
other hand, quality diversity (QD) optimization has been proven to be an
effective component of environment generation algorithms, which can generate
collections of high-quality environments that are diverse in the resulting
agent behaviors. However, these algorithms require potentially expensive
simulations of agents on newly generated environments. We propose Deep
Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD
environment generation algorithm that maintains a deep surrogate model for
predicting agent behaviors in new environments. Results in two benchmark
domains show that DSAGE significantly outperforms existing QD environment
generation algorithms in discovering collections of environments that elicit
diverse behaviors of a state-of-the-art RL agent and a planning agent.
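The abstract says that DSAGE keeps a surrogate model so that fewer expensive agent simulations are needed. The following is a minimal sketch of that general idea, not the authors' implementation. It assumes a loop that searches a grid archive against the surrogate, evaluates only the resulting elites with the real simulator, and then refits the surrogate. Every name here (`simulate`, `KNNSurrogate`, `surrogate_assisted_qd`, `GRID`) is hypothetical. The toy `simulate` function stands in for an agent rollout, and an inverse-distance-weighted nearest-neighbour predictor is swapped in for the deep network.

```python
import random

GRID = 10  # assumed archive resolution per measure dimension


def simulate(env):
    """Toy stand-in for an expensive agent rollout: returns (objective, measures)."""
    x, y = env
    objective = 1.0 - ((x - 0.5) ** 2 + (y - 0.5) ** 2)
    measures = (x, (x + y) / 2.0)  # both stay in [0, 1]
    return objective, measures


class KNNSurrogate:
    """Placeholder for the deep surrogate: inverse-distance-weighted k-NN."""

    def __init__(self, k=3):
        self.k = k
        self.data = []

    def fit(self, data):
        self.data = list(data)

    def predict(self, env):
        def dist(rec):
            return sum((a - b) ** 2 for a, b in zip(rec[0], env))

        nearest = sorted(self.data, key=dist)[: self.k]
        weights = [1.0 / (dist(r) + 1e-9) for r in nearest]
        total = sum(weights)
        obj = sum(w * r[1] for w, r in zip(weights, nearest)) / total
        meas = tuple(
            sum(w * r[2][i] for w, r in zip(weights, nearest)) / total
            for i in range(2)
        )
        return obj, meas


def insert(archive, env, objective, measures):
    """Keep the best environment seen so far in each measure cell."""
    key = tuple(min(GRID - 1, int(m * GRID)) for m in measures)
    if key not in archive or archive[key][1] < objective:
        archive[key] = (env, objective)


def mutate(env, rng, sigma=0.1):
    return tuple(min(1.0, max(0.0, v + rng.gauss(0.0, sigma))) for v in env)


def surrogate_assisted_qd(outer_iters=5, inner_iters=200, seed=0):
    rng = random.Random(seed)
    dataset, truth_archive, evaluations = [], {}, 0

    def evaluate(env):
        nonlocal evaluations
        obj, meas = simulate(env)  # the only expensive call
        evaluations += 1
        dataset.append((env, obj, meas))
        insert(truth_archive, env, obj, meas)

    for _ in range(10):  # bootstrap the dataset with random environments
        evaluate((rng.random(), rng.random()))

    surrogate = KNNSurrogate()
    for _ in range(outer_iters):
        surrogate.fit(dataset)
        surrogate_archive = {}
        for _ in range(inner_iters):  # cheap search using predictions only
            if surrogate_archive and rng.random() < 0.9:
                parent, _ = rng.choice(list(surrogate_archive.values()))
                candidate = mutate(parent, rng)
            else:
                candidate = (rng.random(), rng.random())
            obj, meas = surrogate.predict(candidate)
            insert(surrogate_archive, candidate, obj, meas)
        for env, _ in list(surrogate_archive.values()):
            evaluate(env)  # ground-truth check of predicted elites
    return truth_archive, evaluations


if __name__ == "__main__":
    archive, n_evals = surrogate_assisted_qd()
    print(f"filled {len(archive)} cells with {n_evals} simulator calls")
```

In this sketch the inner loop never calls `simulate`, so the number of expensive evaluations grows with the size of the surrogate archive rather than with the number of candidates considered.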
Related papers
- Adversarial Environment Design via Regret-Guided Diffusion Models [13.651184780336623]
Training agents that are robust to environmental changes remains a significant challenge in deep reinforcement learning (RL).
Unsupervised environment design (UED) has recently emerged to address this issue by generating a set of training environments tailored to the agent's capabilities.
We propose a novel UED algorithm, adversarial environment design via regret-guided diffusion models (ADD).
arXiv Detail & Related papers (2024-10-25T17:35:03Z)
- R-AIF: Solving Sparse-Reward Robotic Tasks from Pixels with Active Inference and World Models [50.19174067263255]
We introduce prior preference learning techniques and self-revision schedules to help the agent excel in sparse-reward, continuous action, goal-based robotic control POMDP environments.
We show that our agents offer improved performance over state-of-the-art models in terms of cumulative rewards, relative stability, and success rate.
arXiv Detail & Related papers (2024-09-21T18:32:44Z)
- Caution for the Environment: Multimodal Agents are Susceptible to Environmental Distractions [68.92637077909693]
This paper investigates the faithfulness of multimodal large language model (MLLM) agents in the graphical user interface (GUI) environment.
A general setting is proposed where both the user and the agent are benign, and the environment, while not malicious, contains unrelated content.
Experimental results reveal that even the most powerful models, whether generalist agents or specialist GUI agents, are susceptible to distractions.
arXiv Detail & Related papers (2024-08-05T15:16:22Z)
- HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
- Arbitrarily Scalable Environment Generators via Neural Cellular Automata [55.150593161240444]
We show that NCA environment generators maintain consistent, regularized patterns regardless of environment size.
Our method scales a single-agent reinforcement learning policy to arbitrarily large environments with similar patterns.
arXiv Detail & Related papers (2023-10-28T07:30:09Z)
- Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling [8.256433006393243]
We introduce a hierarchical MDP framework for environment design under resource constraints.
It consists of an upper-level RL teacher agent that generates suitable training environments for a lower-level student agent.
Our proposed method significantly reduces the resource-intensive interactions between agents and environments.
arXiv Detail & Related papers (2023-09-30T08:21:32Z)
- INTAGS: Interactive Agent-Guided Simulation [4.04638613278729]
In many applications involving multi-agent systems (MAS), it is imperative to test an experimental (Exp) autonomous agent in a high-fidelity simulator prior to its deployment to production.
We propose a metric to distinguish between real and synthetic multi-agent systems, which is evaluated through the live interaction between the Exp and background (BG) agents.
We show that using INTAGS to calibrate the simulator can generate more realistic market data compared to the state-of-the-art conditional Wasserstein Generative Adversarial Network approach.
arXiv Detail & Related papers (2023-09-04T19:56:18Z)
- Adversarial Reinforcement Learning for Procedural Content Generation [0.3779860024918729]
We present an approach for procedural content generation (PCG) and improving generalization in reinforcement learning (RL) agents.
One popular approach is to procedurally generate different environments to increase the generalizability of the trained agents.
Here we deploy an adversarial model with one PCG RL agent and one solving RL agent.
arXiv Detail & Related papers (2021-03-08T15:51:42Z)
- Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED).
Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
arXiv Detail & Related papers (2020-12-03T17:37:01Z)
- Unsupervised Domain Adaptation in Person re-ID via k-Reciprocal Clustering and Large-Scale Heterogeneous Environment Synthesis [76.46004354572956]
We introduce an unsupervised domain adaptation approach for person re-identification.
Experimental results show that the proposed ktCUDA and SHRED approach achieves an average improvement of +5.7 mAP in re-identification performance.
arXiv Detail & Related papers (2020-01-14T17:43:52Z)