Improving adaptability to new environments and removing catastrophic
forgetting in Reinforcement Learning by using an eco-system of agents
- URL: http://arxiv.org/abs/2204.06550v1
- Date: Wed, 13 Apr 2022 17:52:54 GMT
- Title: Improving adaptability to new environments and removing catastrophic
forgetting in Reinforcement Learning by using an eco-system of agents
- Authors: Olivier Moulin, Vincent Francois-Lavet, Paul Elbers, Mark Hoogendoorn
- Abstract summary: Adapting a Reinforcement Learning (RL) agent to an unseen environment is a difficult task due to typical over-fitting on the training environment.
There is a risk of catastrophic forgetting, where the performance on previously seen environments is seriously hampered.
This paper proposes a novel approach that exploits an ecosystem of agents to address both concerns.
- Score: 3.5786621294068373
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Adapting a Reinforcement Learning (RL) agent to an unseen environment is a
difficult task due to typical over-fitting on the training environment. RL
agents are often capable of solving environments very close to the trained
environment, but when environments become substantially different, their
performance quickly drops. When agents are retrained on new environments, a
second issue arises: there is a risk of catastrophic forgetting, where the
performance on previously seen environments is seriously hampered. This paper
proposes a novel approach that exploits an ecosystem of agents to address both
concerns. Hereby, the (limited) adaptive power of individual agents is
harvested to build a highly adaptive ecosystem. This allows part of the
workload to be shifted from learning to inference. An evaluation of the approach on two
distinct distributions of environments shows that our approach outperforms
state-of-the-art techniques in terms of adaptability/generalization while
avoiding catastrophic forgetting.
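To make the idea concrete, here is a minimal sketch of how such an eco-system might be orchestrated. The `Agent` interface, the success threshold, and the probe-then-train loop are illustrative assumptions, not details taken from the paper; the point is that adaptation work moves to inference-time agent selection, and existing agents are never retrained, so nothing is forgotten.

```python
# Hypothetical sketch of an agent eco-system: keep a pool of specialised
# agents, reuse whichever one already solves a new environment, and only
# train (and add) a fresh agent when none of them does.
from typing import Callable, List

class Agent:
    """Placeholder interface for any RL agent."""
    def train(self, env) -> None: ...
    def evaluate(self, env) -> float: ...   # mean episodic return

class EcoSystem:
    def __init__(self, agent_factory: Callable[[], Agent], solve_threshold: float):
        self.agents: List[Agent] = []
        self.agent_factory = agent_factory
        self.solve_threshold = solve_threshold

    def solve(self, env) -> Agent:
        # Inference-time work: probe existing agents before learning anything.
        for agent in self.agents:
            if agent.evaluate(env) >= self.solve_threshold:
                return agent                 # reuse an agent; nothing is retrained
        # No existing agent generalizes: train a new specialist and add it.
        new_agent = self.agent_factory()
        new_agent.train(env)
        self.agents.append(new_agent)        # earlier agents stay untouched,
        return new_agent                     # so nothing is forgotten
```

The pool grows only when no existing agent clears the threshold, which is how the limited adaptive power of each individual agent would be harvested into an adaptive whole.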
Related papers
- Survival of the Fittest: Evolutionary Adaptation of Policies for Environmental Shifts [0.15889427269227555]
We develop an adaptive re-training algorithm (ERPO) inspired by evolutionary game theory (EGT).
ERPO shows faster policy adaptation, higher average rewards, and reduced computational cost.
arXiv Detail & Related papers (2024-10-22T09:29:53Z)
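A rough sketch of what an EGT-inspired re-training step could look like: a population of policy parameter vectors is re-weighted by fitness on the shifted environment (replicator-style selection) and mutated. The function below is an illustrative assumption only; the abstract does not specify ERPO's actual update.

```python
# Hedged sketch of evolutionary policy adaptation; not ERPO's actual algorithm.
import numpy as np

def evolve_policies(population: np.ndarray, fitness_fn, generations: int = 50,
                    noise_scale: float = 0.05, seed: int = 0) -> np.ndarray:
    """population: (n_policies, n_params) array of policy parameter vectors."""
    rng = np.random.default_rng(seed)
    for _ in range(generations):
        fitness = np.array([fitness_fn(p) for p in population])
        # Replicator-style selection: fitter policies reproduce more often.
        probs = np.exp(fitness - fitness.max())
        probs /= probs.sum()
        parents = rng.choice(len(population), size=len(population), p=probs)
        # Mutate offspring slightly to keep exploring the shifted environment.
        population = population[parents] + noise_scale * rng.standard_normal(population.shape)
    fitness = np.array([fitness_fn(p) for p in population])
    return population[int(fitness.argmax())]     # best adapted policy
```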
- No Regrets: Investigating and Improving Regret Approximations for Curriculum Discovery [53.08822154199948]
Unsupervised Environment Design (UED) methods have gained recent attention as their adaptive curricula promise to enable agents to be robust to in- and out-of-distribution tasks.
This work investigates how existing UED methods select training environments, focusing on task prioritisation metrics.
We develop a method that directly trains on scenarios with high learnability.
arXiv Detail & Related papers (2024-08-27T14:31:54Z)
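The selection rule can be sketched as follows, assuming learnability is scored as p(1 - p) over a level's success rate p, so levels the agent sometimes, but not always, solves are sampled most often. The scoring form and the sampling scheme are assumptions for illustration, not details from the abstract.

```python
# Hedged sketch of learnability-weighted level sampling.
import numpy as np

def learnability(success_rates: np.ndarray) -> np.ndarray:
    p = np.clip(success_rates, 0.0, 1.0)
    return p * (1.0 - p)              # zero when always solved or never solved

def sample_training_levels(success_rates: np.ndarray, k: int, rng=None) -> np.ndarray:
    rng = rng or np.random.default_rng()
    scores = learnability(success_rates)
    if scores.sum() == 0.0:           # fall back to uniform sampling
        probs = np.full(len(scores), 1.0 / len(scores))
    else:
        probs = scores / scores.sum()
    return rng.choice(len(success_rates), size=k, p=probs)
```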
- Generalization through Diversity: Improving Unsupervised Environment Design [8.961693126230452]
We propose a principled approach to adaptively identify diverse environments based on a novel distance measure relevant to environment design.
We empirically demonstrate the versatility and effectiveness of our method in comparison to multiple leading approaches for unsupervised environment design.
arXiv Detail & Related papers (2023-01-19T11:55:47Z)
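One plausible reading of this entry is a greedy max-min selection over some environment embedding: repeatedly pick the environment farthest from those already chosen. The feature embedding and Euclidean distance below are stand-ins; the paper's actual distance measure is not described here.

```python
# Hedged sketch of diversity-driven environment selection (greedy max-min).
import numpy as np

def select_diverse(env_features: np.ndarray, k: int) -> list[int]:
    """env_features: (n_envs, dim) embedding of candidate environments."""
    chosen = [0]                                   # seed with an arbitrary env
    while len(chosen) < k:
        dists = np.linalg.norm(
            env_features[:, None, :] - env_features[chosen][None, :, :], axis=-1)
        min_dist = dists.min(axis=1)               # distance to nearest chosen env
        min_dist[chosen] = -np.inf                 # never re-pick a chosen env
        chosen.append(int(min_dist.argmax()))      # farthest from current set
    return chosen
```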
- AACC: Asymmetric Actor-Critic in Contextual Reinforcement Learning [13.167123175701802]
This paper formalizes the task of adapting to changing environmental dynamics in Reinforcement Learning (RL).
We then propose the Asymmetric Actor-Critic in Contextual RL (AACC) as an end-to-end actor-critic method to deal with such generalization tasks.
We experimentally demonstrate substantial performance improvements of AACC over existing baselines in a range of simulated environments.
arXiv Detail & Related papers (2022-08-03T22:52:26Z)
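An asymmetric actor-critic of this kind is commonly implemented by giving the critic privileged environment context during training while the actor sees only the observation, so the trained policy can be deployed where the context is unknown. The PyTorch module below is a generic sketch of that pattern with arbitrary layer sizes, not AACC's actual architecture.

```python
# Hedged sketch of an asymmetric actor-critic for contextual RL.
import torch
import torch.nn as nn

class AsymmetricActorCritic(nn.Module):
    def __init__(self, obs_dim: int, ctx_dim: int, act_dim: int, hidden: int = 64):
        super().__init__()
        self.actor = nn.Sequential(                 # context-free: deployable
            nn.Linear(obs_dim, hidden), nn.Tanh(), nn.Linear(hidden, act_dim))
        self.critic = nn.Sequential(                # context-aware: training only
            nn.Linear(obs_dim + ctx_dim, hidden), nn.Tanh(), nn.Linear(hidden, 1))

    def forward(self, obs: torch.Tensor, ctx: torch.Tensor):
        logits = self.actor(obs)                    # never sees the context
        value = self.critic(torch.cat([obs, ctx], dim=-1))
        return logits, value
```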
- Emergent Complexity and Zero-shot Transfer via Unsupervised Environment Design [121.73425076217471]
We propose Unsupervised Environment Design (UED), where developers provide environments with unknown parameters, and these parameters are used to automatically produce a distribution over valid, solvable environments.
We call our technique Protagonist Antagonist Induced Regret Environment Design (PAIRED).
Our experiments demonstrate that PAIRED produces a natural curriculum of increasingly complex environments, and PAIRED agents achieve higher zero-shot transfer performance when tested in highly novel environments.
arXiv Detail & Related papers (2020-12-03T17:37:01Z)
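The regret signal PAIRED trains its environment designer on is often approximated as the gap between the antagonist's best return and the protagonist's average return on the proposed environment; a minimal sketch, with exact estimators varying in practice:

```python
# Hedged sketch of the PAIRED regret approximation.
def paired_regret(protagonist_returns: list[float],
                  antagonist_returns: list[float]) -> float:
    # Approximate regret = best antagonist return - mean protagonist return
    # on the same proposed environment; the designer maximizes this signal.
    return max(antagonist_returns) - sum(protagonist_returns) / len(protagonist_returns)
```

Maximizing this gap pushes the designer toward environments that are solvable (the antagonist succeeds) but not yet solved by the protagonist, which is what yields the natural curriculum.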
- Cautious Adaptation For Reinforcement Learning in Safety-Critical Settings [129.80279257258098]
Reinforcement learning (RL) in real-world safety-critical target settings like urban driving is hazardous.
We propose a "safety-critical adaptation" task setting: an agent first trains in non-safety-critical "source" environments, then adapts to the safety-critical target environment.
We propose a solution approach, CARL, that builds on the intuition that prior experience in diverse environments equips an agent to estimate risk.
arXiv Detail & Related papers (2020-08-15T01:40:59Z)
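One way to operationalize that intuition is pessimistic action scoring with an ensemble trained across the diverse source environments, treating ensemble disagreement as a risk estimate. The sketch below is an assumption about the general pattern, not CARL's actual method.

```python
# Hedged sketch of cautious, risk-aware action selection.
import numpy as np

def cautious_action(candidate_actions, ensemble_return_estimates,
                    risk_weight: float = 1.0):
    """ensemble_return_estimates: callables, one per source-trained model."""
    best_action, best_score = None, -np.inf
    for action in candidate_actions:
        estimates = np.array([model(action) for model in ensemble_return_estimates])
        # Prior experience across diverse source environments lets the
        # ensemble's disagreement act as a risk estimate: be pessimistic.
        score = estimates.mean() - risk_weight * estimates.std()
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```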
- Self-Supervised Policy Adaptation during Deployment [98.25486842109936]
Self-supervision allows the policy to continue training after deployment without using any rewards.
Empirical evaluations are performed on diverse simulation environments from the DeepMind Control suite and ViZDoom.
Our method improves generalization in 31 out of 36 environments across various tasks and outperforms domain randomization on a majority of environments.
arXiv Detail & Related papers (2020-07-08T17:56:27Z)
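Test-time self-supervision of this kind is typically implemented with an auxiliary head, e.g. inverse dynamics prediction, whose gradient keeps updating the shared encoder after deployment; no reward is involved. The PyTorch step below sketches that pattern under those assumptions, with module shapes left as placeholders.

```python
# Hedged sketch of a reward-free adaptation step during deployment.
import torch
import torch.nn.functional as F

def adaptation_step(encoder, inverse_dynamics_head, optimizer,
                    obs_t, obs_t1, action_taken):
    # action_taken: LongTensor of the discrete action indices actually executed.
    z_t, z_t1 = encoder(obs_t), encoder(obs_t1)
    predicted = inverse_dynamics_head(torch.cat([z_t, z_t1], dim=-1))
    loss = F.cross_entropy(predicted, action_taken)   # self-supervised signal
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()                                  # encoder adapts online
    return loss.item()
```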
- Environment Shaping in Reinforcement Learning using State Abstraction [63.444831173608605]
We propose a novel framework of environment shaping using state abstraction.
Our key idea is to compress the environment's large, noisy state space into an abstract space.
We show that the agent's policy learnt in the shaped environment preserves near-optimal behavior in the original environment.
arXiv Detail & Related papers (2020-06-23T17:00:22Z)
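The shaping idea can be sketched as a wrapper that maps the raw, noisy state to an abstract state before the agent ever sees it. The abstraction function phi is assumed given here; how to construct it so that near-optimal behavior is preserved is the paper's contribution and is not shown.

```python
# Hedged sketch of environment shaping via a state-abstraction wrapper.
class AbstractedEnv:
    def __init__(self, env, phi):
        self.env, self.phi = env, phi      # phi: raw state -> abstract state

    def reset(self):
        return self.phi(self.env.reset())  # the agent only sees abstract states

    def step(self, action):
        state, reward, done, info = self.env.step(action)
        return self.phi(state), reward, done, info
```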
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)