Adversarial Environment Design via Regret-Guided Diffusion Models
- URL: http://arxiv.org/abs/2410.19715v2
- Date: Fri, 15 Nov 2024 01:01:44 GMT
- Title: Adversarial Environment Design via Regret-Guided Diffusion Models
- Authors: Hojun Chung, Junseo Lee, Minsoo Kim, Dohyeong Kim, Songhwai Oh
- Abstract summary: Training agents that are robust to environmental changes remains a significant challenge in deep reinforcement learning.
Unsupervised environment design (UED) has recently emerged to address this issue by generating a set of training environments tailored to the agent's capabilities.
We propose a novel UED algorithm, adversarial environment design via regret-guided diffusion models (ADD).
- Score: 13.651184780336623
- Abstract: Training agents that are robust to environmental changes remains a significant challenge in deep reinforcement learning (RL). Unsupervised environment design (UED) has recently emerged to address this issue by generating a set of training environments tailored to the agent's capabilities. While prior works demonstrate that UED has the potential to learn a robust policy, their performance is constrained by the capabilities of the environment generation. To this end, we propose a novel UED algorithm, adversarial environment design via regret-guided diffusion models (ADD). The proposed method guides the diffusion-based environment generator with the regret of the agent to produce environments that the agent finds challenging but conducive to further improvement. By exploiting the representation power of diffusion models, ADD can directly generate adversarial environments while maintaining the diversity of training environments, enabling the agent to effectively learn a robust policy. Our experimental results demonstrate that the proposed method successfully generates an instructive curriculum of environments, outperforming UED baselines in zero-shot generalization across novel, out-of-distribution environments. Project page: https://rllab-snu.github.io/projects/ADD
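The abstract describes guiding a diffusion-based environment generator with the agent's regret. A minimal sketch of this idea, in the style of classifier guidance, is shown below; the linear "denoising" drift and the quadratic regret surrogate are illustrative stand-ins (the paper's actual denoiser and regret estimator are learned models), and `theta_hard` is a hypothetical parameter setting where the toy regret peaks.

```python
import numpy as np

def regret_grad(theta, theta_hard):
    # Hypothetical regret surrogate whose gradient points toward
    # environment parameters the agent finds harder (illustrative only).
    return -(theta - theta_hard)

def guided_sample(theta_hard, steps=100, eta=0.1, guidance=0.5, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=theta_hard.shape)      # start from noise
    for t in range(steps):
        denoise = -eta * theta                     # stand-in denoising drift
        guide = guidance * eta * regret_grad(theta, theta_hard)
        # Noise is annealed to zero over the trajectory.
        noise = np.sqrt(eta) * (1 - t / steps) * rng.normal(size=theta.shape)
        theta = theta + denoise + guide + noise
    return theta

# Sampled environment parameters are pulled toward the high-regret region
# while the diffusion prior keeps them from collapsing onto it entirely.
env_params = guided_sample(theta_hard=np.array([2.0, -1.0]))
```

The key point the sketch illustrates: regret guidance biases each denoising step rather than filtering finished samples, so the generator produces adversarial environments directly while the stochastic sampling preserves diversity.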
Related papers
- No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks.
This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics.
We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z) - HAZARD Challenge: Embodied Decision Making in Dynamically Changing Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z) - Enhancing the Hierarchical Environment Design via Generative Trajectory Modeling [8.256433006393243]
We introduce a hierarchical MDP framework for environment design under resource constraints.
It consists of an upper-level RL teacher agent that generates suitable training environments for a lower-level student agent.
Our proposed method significantly reduces the resource-intensive interactions between agents and environments.
arXiv Detail & Related papers (2023-09-30T08:21:32Z) - Stabilizing Unsupervised Environment Design with a Learned Adversary [28.426666219969555]
A key challenge in training generally-capable agents is the design of training tasks that facilitate broad generalization and robustness to environment variations.
A pioneering approach for Unsupervised Environment Design (UED) is PAIRED, which uses reinforcement learning to train a teacher policy to design tasks from scratch.
Despite its strong theoretical backing, PAIRED suffers from a variety of challenges that hinder its practical performance.
We make it possible for PAIRED to match or exceed state-of-the-art methods, producing robust agents in several established challenging procedurally-generated environments.
arXiv Detail & Related papers (2023-08-21T15:42:56Z) - Free Lunch for Domain Adversarial Training: Environment Label Smoothing [82.85757548355566]
We propose Environment Label Smoothing (ELS) to improve training stability, local convergence, and robustness to noisy environment labels.
We yield state-of-art results on a wide range of domain generalization/adaptation tasks, particularly when the environment labels are highly noisy.
arXiv Detail & Related papers (2023-02-01T02:55:26Z) - Grounding Aleatoric Uncertainty in Unsupervised Environment Design [32.00797965770773]
In partially-observable settings, optimal policies may depend on the ground-truth distribution over aleatoric parameters of the environment.
We propose a minimax regret UED method that optimizes the ground-truth utility function, even when the underlying training data is biased due to curriculum-induced covariate shift (CICS).
arXiv Detail & Related papers (2022-07-11T22:45:29Z) - Deep Surrogate Assisted Generation of Environments [7.217405582720078]
Quality diversity (QD) optimization has been proven to be an effective component of environment generation algorithms.
We propose Deep Surrogate Assisted Generation of Environments (DSAGE), a sample-efficient QD environment generation algorithm.
Results in two benchmark domains show that DSAGE significantly outperforms existing QD environment generation algorithms.
arXiv Detail & Related papers (2022-06-09T00:14:03Z) - EnvEdit: Environment Editing for Vision-and-Language Navigation [98.30038910061894]
In Vision-and-Language Navigation (VLN), an agent needs to navigate through the environment based on natural language instructions.
We propose EnvEdit, a data augmentation method that creates new environments by editing existing environments.
We show that our proposed EnvEdit method gets significant improvements in all metrics on both pre-trained and non-pre-trained VLN agents.
arXiv Detail & Related papers (2022-03-29T15:44:32Z) - Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED)
Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
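PAIRED's teacher reward can be stated in a few lines: the regret of the protagonist on a generated environment is approximated by the gap between the antagonist's return and the protagonist's return. The sketch below uses placeholder return values rather than rollouts from trained agents.

```python
def paired_regret(antagonist_return: float, protagonist_return: float) -> float:
    # High regret means the environment is solvable (the antagonist does
    # well) but the protagonist still struggles, so it is worth training on.
    return antagonist_return - protagonist_return

# The teacher maximizes this signal; the protagonist minimizes it by
# improving on the environments the teacher proposes.
teacher_reward = paired_regret(antagonist_return=0.9, protagonist_return=0.4)
```

Because the antagonist must actually solve the environment to produce regret, the teacher is discouraged from generating unsolvable tasks, which is what yields the natural curriculum described above.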
arXiv Detail & Related papers (2020-12-03T17:37:01Z) - Environment Shaping in Reinforcement Learning using State Abstraction [63.444831173608605]
We propose a novel framework of environment shaping using state abstraction.
Our key idea is to compress the environment's large state space with noisy signals to an abstracted space.
We show that the agent's policy learnt in the shaped environment preserves near-optimal behavior in the original environment.
arXiv Detail & Related papers (2020-06-23T17:00:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.