Generalization through Diversity: Improving Unsupervised Environment
Design
- URL: http://arxiv.org/abs/2301.08025v2
- Date: Tue, 19 Sep 2023 03:27:44 GMT
- Authors: Wenjun Li, Pradeep Varakantham, Dexun Li
- Abstract summary: We propose a principled approach to adaptively identify diverse environments based on a novel distance measure relevant to environment design.
We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design.
- Score: 8.961693126230452
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Agent decision making using Reinforcement Learning (RL) heavily relies on
either a model or simulator of the environment (e.g., moving in an 8x8 maze
with three rooms, playing Chess on an 8x8 board). Due to this dependence, small
changes in the environment (e.g., positions of obstacles in the maze, size of
the board) can severely affect the effectiveness of the policy learned by the
agent. To that end, existing work has proposed training RL agents on an
adaptive curriculum of environments (generated automatically) to improve
performance on out-of-distribution (OOD) test scenarios. Specifically, existing
research has employed the potential for the agent to learn in an environment
(captured using Generalized Advantage Estimation, GAE) as the key factor to
select the next environment(s) to train the agent. However, such a mechanism
can select similar environments (with a high potential to learn) thereby making
agent training redundant on all but one of those environments. To that end, we
provide a principled approach to adaptively identify diverse environments based
on a novel distance measure relevant to environment design. We empirically
demonstrate the versatility and effectiveness of our method in comparison to
multiple leading approaches for unsupervised environment design on three
distinct benchmark problems used in literature.
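
The selection idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' algorithm: the scoring proxy (mean absolute GAE), the greedy selection rule, the `min_dist` threshold, and the L1 distance over hypothetical level feature vectors are all assumptions made for illustration, and the paper's own distance measure is not reproduced here.

```python
from typing import Callable, List, Sequence


def gae_advantages(rewards: Sequence[float], values: Sequence[float],
                   gamma: float = 0.99, lam: float = 0.95) -> List[float]:
    """Generalized Advantage Estimation for a single episode.

    `values` must hold len(rewards) + 1 entries; the last one is the
    bootstrap value of the state reached after the final step.
    """
    advantages = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        # One-step TD error at time t.
        delta = rewards[t] + gamma * values[t + 1] - values[t]
        # Exponentially weighted sum of future TD errors.
        running = delta + gamma * lam * running
        advantages[t] = running
    return advantages


def learning_potential(rewards: Sequence[float], values: Sequence[float],
                       gamma: float = 0.99, lam: float = 0.95) -> float:
    """Assumed proxy for learning potential: mean absolute GAE."""
    adv = gae_advantages(rewards, values, gamma, lam)
    return sum(abs(a) for a in adv) / len(adv)


def select_diverse(scores: Sequence[float],
                   distance: Callable[[int, int], float],
                   k: int, min_dist: float) -> List[int]:
    """Greedily pick up to k high-scoring levels that are mutually distant.

    Candidates are visited in descending score order; a candidate is kept
    only if it lies at least `min_dist` from every level already chosen.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    chosen: List[int] = []
    for i in order:
        if all(distance(i, j) >= min_dist for j in chosen):
            chosen.append(i)
        if len(chosen) == k:
            break
    return chosen


# Hypothetical levels described by 2-D feature vectors (e.g. an obstacle position).
features = [(0, 0), (0, 1), (5, 5)]
scores = [0.9, 0.85, 0.3]


def l1(i: int, j: int) -> float:
    """Placeholder L1 distance between two hypothetical level descriptors."""
    return float(sum(abs(a - b) for a, b in zip(features[i], features[j])))


print(select_diverse(scores, l1, k=2, min_dist=2.0))  # [0, 2]
```

Here levels 0 and 1 both score highly but are nearly identical, so selecting purely by score would train on redundant levels. The greedy filter skips level 1 and picks the more distant level 2 instead.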
Related papers
- Towards Generalizable Reinforcement Learning via Causality-Guided Self-Adaptive Representations [22.6449779859417]
General intelligence requires quick adaptation across tasks.
In this paper, we explore a wider range of scenarios where not only the distribution but also the environment spaces may change.
We introduce a causality-guided self-adaptive representation-based approach, called CSR, that equips the agent to generalize effectively.
arXiv Detail & Related papers (2024-07-30T08:48:49Z)
- HAZARD Challenge: Embodied Decision Making in Dynamically Changing
Environments [93.94020724735199]
HAZARD consists of three unexpected disaster scenarios, including fire, flood, and wind.
This benchmark enables us to evaluate autonomous agents' decision-making capabilities across various pipelines.
arXiv Detail & Related papers (2024-01-23T18:59:43Z)
- Diversity Induced Environment Design via Self-Play [9.172096093540357]
We propose a task-agnostic method to identify observed/hidden states that are representative of a given level.
The outcome of this method is then utilized to characterize the diversity between two levels, which as we show can be crucial to effective performance.
In addition, to improve sampling efficiency, we incorporate the self-play technique that allows the environment generator to automatically generate environments that are of great benefit to the training agent.
arXiv Detail & Related papers (2023-02-04T07:31:36Z)
- Environment Optimization for Multi-Agent Navigation [11.473177123332281]
The goal of this paper is to consider the environment as a decision variable in a system-level optimization problem.
We show, through formal proofs, under which conditions the environment can change while guaranteeing completeness.
In order to accommodate a broad range of implementation scenarios, we include both online and offline optimization, and both discrete and continuous environment representations.
arXiv Detail & Related papers (2022-09-22T19:22:16Z)
- AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning [13.167123175701802]
This paper formalizes the task of adapting to changing environmental dynamics in Reinforcement Learning (RL).
We then propose the Asymmetric Actor-Critic in Contextual RL (AACC) as an end-to-end actor-critic method to deal with such generalization tasks.
We demonstrate the essential improvements in the performance of AACC over existing baselines experimentally in a range of simulated environments.
arXiv Detail & Related papers (2022-08-03T22:52:26Z)
- Improving adaptability to new environments and removing catastrophic
forgetting in Reinforcement Learning by using an eco-system of agents [3.5786621294068373]
Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task due to typical over-fitting on the training environment.
There is a risk of catastrophic forgetting, where the performance on previously seen environments is seriously hampered.
This paper proposes a novel approach that exploits an ecosystem of agents to address both concerns.
arXiv Detail & Related papers (2022-04-13T17:52:54Z)
- Scenic4RL: Programmatic Modeling and Generation of Reinforcement
Learning Environments [89.04823188871906]
Generation of diverse realistic scenarios is challenging for real-time strategy (RTS) environments.
Most of the existing simulators rely on randomly generating the environments.
We introduce the benefits of adopting an existing formal scenario specification language, SCENIC, to assist researchers.
arXiv Detail & Related papers (2021-06-18T21:49:46Z)
- Emergent Complexity and Zero-shot Transfer via Unsupervised Environment
Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED).
Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
arXiv Detail & Related papers (2020-12-03T17:37:01Z)
- Self-Supervised Policy Adaptation during Deployment [98.25486842109936]
Self-supervision allows the policy to continue training after deployment without using any rewards.
Empirical evaluations are performed on diverse simulation environments from DeepMind Control suite and ViZDoom.
Our method improves generalization in 31 out of 36 environments across various tasks and outperforms domain randomization on a majority of environments.
arXiv Detail & Related papers (2020-07-08T17:56:27Z)
- Environment Shaping in Reinforcement Learning using State Abstraction [63.444831173608605]
We propose a novel framework of environment shaping using state abstraction.
Our key idea is to compress the environment's large state space with noisy signals to an abstracted space.
We show that the agent's policy learnt in the shaped environment preserves near-optimal behavior in the original environment.
arXiv Detail & Related papers (2020-06-23T17:00:22Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)