Evolutionary Reinforcement Learning Dynamics with Irreducible
Environmental Uncertainty
- URL: http://arxiv.org/abs/2109.07259v1
- Date: Wed, 15 Sep 2021 12:50:58 GMT
- Title: Evolutionary Reinforcement Learning Dynamics with Irreducible
Environmental Uncertainty
- Authors: Wolfram Barfuss and Richard P. Mann
- Abstract summary: We derive and present evolutionary reinforcement learning dynamics in which the agents are irreducibly uncertain about the current state of the environment.
We find that irreducible environmental uncertainty can lead to better learning outcomes more quickly, stabilize the learning process, and help overcome social dilemmas.
However, we also find that partial observability may cause worse learning outcomes, for example in the form of a catastrophic limit cycle.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this work we derive and present evolutionary reinforcement learning
dynamics in which the agents are irreducibly uncertain about the current state
of the environment. We evaluate the dynamics across different classes of
partially observable agent-environment systems and find that irreducible
environmental uncertainty can lead to better learning outcomes more quickly,
stabilize the learning process, and help overcome social dilemmas. However, as
expected, we also find that partial observability may cause worse learning
outcomes, for example, in the form of a catastrophic limit cycle. Compared to
fully observant agents, learning with irreducible environmental uncertainty
often requires more exploration and less weight on future rewards to obtain the
best learning outcomes. Furthermore, we find a range of dynamical effects
induced by partial observability, e.g., a critical slowing down of the learning
processes between reward regimes and the separation of the learning dynamics
into fast and slow directions. The presented dynamics are a practical tool for
researchers in biology, social science and machine learning to systematically
investigate the evolutionary effects of environmental uncertainty.
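The paper derives these learning dynamics analytically; the sketch below is only a loose illustration of the setting. It iterates a replicator-style softmax policy update on a toy two-state environment in which a mixed observation kernel makes the state irreducibly uncertain. The environment tensors, the belief construction, and the exact form of the update are illustrative assumptions, not the paper's equations.

```python
import numpy as np

# Toy POMDP (all numbers illustrative): 2 states, 2 observations, 2 actions.
nS, nO, nA = 2, 2, 2
T = np.array([[[0.9, 0.1], [0.1, 0.9]],   # T[s, a, s'] transition kernel
              [[0.1, 0.9], [0.9, 0.1]]])
R = np.array([[1.0, 0.0],                 # R[s, a] rewards
              [0.0, 1.0]])
O = np.array([[0.8, 0.2],                 # O[s, o] observation kernel; mixed
              [0.2, 0.8]])                # rows = irreducible uncertainty

gamma, alpha, tau = 0.7, 0.1, 0.05        # discount, step size, exploration

def stationary_state_dist(X):
    """Stationary state distribution under the obs-conditioned policy X[o, a]."""
    pi_sa = O @ X                                  # pi(a|s) = sum_o O[s,o] X[o,a]
    P = np.einsum('sa,sap->sp', pi_sa, T)          # induced state-to-state chain
    evals, evecs = np.linalg.eig(P.T)
    v = np.real(evecs[:, np.argmax(np.real(evals))])
    return v / v.sum()

def obs_action_values(X):
    """Q(o, a): state-action values averaged over the belief b(s|o)."""
    p_s = stationary_state_dist(X)
    b = p_s[:, None] * O                           # joint p(s, o)
    b = b / b.sum(axis=0, keepdims=True)           # belief b[s, o] = p(s|o)
    pi_sa = O @ X
    r_s = (pi_sa * R).sum(axis=1)                  # expected reward per state
    P = np.einsum('sa,sap->sp', pi_sa, T)
    V = np.linalg.solve(np.eye(nS) - gamma * P, r_s)  # policy evaluation
    Q_sa = R + gamma * T @ V                       # Q[s, a]
    return np.einsum('so,sa->oa', b, Q_sa)         # average over the belief

X = np.full((nO, nA), 0.5)                         # uniform initial policy
for _ in range(200):
    Q = obs_action_values(X)
    X = X * np.exp(alpha * Q / tau)                # replicator-style update
    X /= X.sum(axis=1, keepdims=True)

print("learned policy X[o, a]:\n", np.round(X, 3))
```

Sharpening the observation kernel O toward the identity recovers a fully observant comparison case, and the temperature tau is the exploration knob the abstract refers to when it notes that uncertain agents often need more exploration.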
Related papers
- Variable-Agnostic Causal Exploration for Reinforcement Learning [56.52768265734155]
We introduce a novel framework, Variable-Agnostic Causal Exploration for Reinforcement Learning (VACERL).
Our approach automatically identifies crucial observation-action steps associated with key variables using attention mechanisms.
It constructs the causal graph connecting these steps, which guides the agent towards observation-action pairs with greater causal influence on task completion.
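As a rough sketch of that idea (the module names, sizes, and success-prediction head below are assumptions, not VACERL's architecture), one might score trajectory steps with a single attention query and keep the top-weighted steps as candidate nodes for the causal graph:

```python
import torch
import torch.nn as nn

# Hypothetical sketch: attention over observation-action step embeddings;
# high-weight steps are treated as "crucial" for task completion.
class CrucialStepScorer(nn.Module):
    def __init__(self, dim=32):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.query = nn.Parameter(torch.randn(1, 1, dim))
        self.head = nn.Linear(dim, 1)    # predicts task success from context

    def forward(self, steps):            # steps: (B, T, dim)
        q = self.query.expand(steps.size(0), -1, -1)
        ctx, w = self.attn(q, steps, steps)   # w: (B, 1, T) attention weights
        return self.head(ctx.squeeze(1)), w.squeeze(1)

scorer = CrucialStepScorer()
traj = torch.randn(1, 50, 32)            # one 50-step trajectory (random demo)
_, weights = scorer(traj)
crucial = weights.topk(5).indices.sort().values   # candidate graph nodes,
print(crucial)                                    # linked in temporal order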
arXiv Detail & Related papers (2024-07-17T09:45:27Z)
- Network bottlenecks and task structure control the evolution of interpretable learning rules in a foraging agent [0.0]
We study meta-learning via evolutionary optimization of simple reward-modulated plasticity rules in embodied agents.
We show that unconstrained meta-learning leads to the emergence of diverse plasticity rules.
Our findings indicate that the meta-learning of plasticity rules is very sensitive to various parameters, with this sensitivity possibly reflected in the learning rules found in biological networks.
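A minimal sketch of the general recipe, assuming a toy one-neuron foraging task and a three-coefficient reward-modulated Hebbian rule (the task, the rule parameterization, and the evolutionary loop are all illustrative, not the paper's setup):

```python
import numpy as np
rng = np.random.default_rng(0)

# Evolve coefficients (a, b, c) of a reward-modulated plasticity rule
#   dw = lr * r * (a * pre * post + b * pre + c * post)
# by how much lifetime reward an agent using the rule collects.
def lifetime_reward(theta, steps=200, lr=0.05):
    a, b, c = theta
    w, total = 0.0, 0.0
    for _ in range(steps):
        pre = rng.choice([0.0, 1.0])                 # food cue present or not
        post = 1.0 / (1.0 + np.exp(-(w * pre)))     # neuron's response
        action = post > 0.5                          # "eat" decision
        r = 1.0 if action == (pre > 0.5) else -1.0   # eat only when cued
        total += r
        w += lr * r * (a * pre * post + b * pre + c * post)
    return total

pop = rng.normal(size=(20, 3))
for gen in range(30):
    fitness = np.array([lifetime_reward(th) for th in pop])
    parents = pop[np.argsort(fitness)[-5:]]          # truncation selection
    children = parents[rng.integers(0, 5, 15)] + 0.1 * rng.normal(size=(15, 3))
    pop = np.concatenate([parents, children])

best = pop[np.argsort([lifetime_reward(th) for th in pop])[-1]]
print("best rule coefficients (a, b, c):", np.round(best, 2))
```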
arXiv Detail & Related papers (2024-03-20T14:57:02Z)
- Disentangling the Causes of Plasticity Loss in Neural Networks [55.23250269007988]
We show that loss of plasticity can be decomposed into multiple independent mechanisms.
We show that a combination of layer normalization and weight decay is highly effective at maintaining plasticity in a variety of synthetic nonstationary learning tasks.
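In PyTorch terms, that combination amounts to something like the following (layer sizes illustrative):

```python
import torch
import torch.nn as nn

# Small MLP with layer normalization after each hidden layer.
net = nn.Sequential(
    nn.Linear(32, 128),
    nn.LayerNorm(128),
    nn.ReLU(),
    nn.Linear(128, 128),
    nn.LayerNorm(128),
    nn.ReLU(),
    nn.Linear(128, 10),
)
# Weight decay (L2 regularization) keeps weights near their init scale.
opt = torch.optim.AdamW(net.parameters(), weight_decay=1e-2)
```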
arXiv Detail & Related papers (2024-02-29T00:02:33Z)
- Environment Design for Inverse Reinforcement Learning [3.085995273374333]
Current inverse reinforcement learning methods that focus on learning from a single environment can fail to handle slight changes in the environment dynamics.
In our framework, the learner repeatedly interacts with the expert, with the former selecting environments to identify the reward function.
This results in improvements in both sample-efficiency and robustness, as we show experimentally, for both exact and approximate inference.
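A toy sketch of that interaction protocol, assuming a finite candidate set of reward functions and bandit-like environments where each action leads to one state (all illustrative; the paper's framework is more general):

```python
import numpy as np

# Candidate reward functions over 3 states; the expert knows the true one.
candidates = [np.array([1.0, 0.0, 0.5]),
              np.array([0.0, 1.0, 0.5]),
              np.array([0.5, 0.2, 1.0])]
true_r = candidates[1]                      # unknown to the learner
environments = [[0, 1], [0, 2], [1, 2]]    # reachable states per environment

alive = list(range(len(candidates)))
for _ in range(3):
    # Select the environment where surviving candidates disagree the most.
    def disagreement(env):
        return len({int(np.argmax(candidates[i][env])) for i in alive})
    env = max(environments, key=disagreement)
    expert_choice = env[int(np.argmax(true_r[env]))]    # expert demonstration
    alive = [i for i in alive
             if env[int(np.argmax(candidates[i][env]))] == expert_choice]
    if len(alive) == 1:
        break

print("identified reward function:", candidates[alive[0]])
```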
arXiv Detail & Related papers (2022-10-26T18:31:17Z)
- Critical Learning Periods for Multisensory Integration in Deep Networks [112.40005682521638]
We show that the ability of a neural network to integrate information from diverse sources hinges critically on being exposed to properly correlated signals during the early phases of training.
We show that critical periods arise from the complex and unstable early transient dynamics, which are decisive for the final performance of the trained system and its learned representations.
arXiv Detail & Related papers (2022-10-06T23:50:38Z)
- Robust Imitation Learning against Variations in Environment Dynamics [17.15933046951096]
We propose a robust imitation learning (IL) framework that improves the robustness of IL when environment dynamics are perturbed.
Our framework effectively deals with environments with varying dynamics by imitating multiple experts in sampled environment dynamics.
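A minimal sketch of that recipe on a one-dimensional linear system (the dynamics family, the per-environment expert controller, and regression-based cloning are illustrative assumptions):

```python
import numpy as np
rng = np.random.default_rng(0)

# Toy system x' = a*x + u, with the coefficient a sampled per "environment".
# Each environment's expert applies its stabilizing action u = -a*x; cloning
# regresses u on x across demonstrations from the whole dynamics family.
X, U = [], []
for _ in range(20):
    a = rng.uniform(0.5, 1.5)            # perturbed dynamics
    x = rng.normal()
    for _ in range(10):
        u = -a * x                       # this environment's expert
        X.append([x]); U.append(u)
        x = a * x + u + 0.01 * rng.normal()

k = np.linalg.lstsq(np.array(X), np.array(U), rcond=None)[0]
print("cloned feedback gain:", k)        # averages over the dynamics family
```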
arXiv Detail & Related papers (2022-06-19T03:06:13Z)
- Variational Dynamic for Self-Supervised Exploration in Deep Reinforcement Learning [12.76337275628074]
In this work, we propose a variational dynamic model based on conditional variational inference to model the multimodality and stochasticity of environmental transitions.
We derive an upper bound on the negative log-likelihood of the environmental transition and use this upper bound as an intrinsic reward for exploration.
Our method outperforms several state-of-the-art environment model-based exploration approaches.
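A compact sketch of such an intrinsic reward, assuming a conditional VAE over transitions with a Gaussian reconstruction term (architecture and sizes are illustrative; the per-sample negative ELBO upper-bounds the transition negative log-likelihood):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TransitionCVAE(nn.Module):
    def __init__(self, s_dim, a_dim, z_dim=8):
        super().__init__()
        self.enc = nn.Linear(2 * s_dim + a_dim, 2 * z_dim)   # q(z | s, a, s')
        self.dec = nn.Linear(s_dim + a_dim + z_dim, s_dim)   # p(s' | s, a, z)

    def nll_upper_bound(self, s, a, s_next):
        mu, logvar = self.enc(torch.cat([s, a, s_next], -1)).chunk(2, -1)
        z = mu + torch.randn_like(mu) * (0.5 * logvar).exp() # reparameterize
        recon = self.dec(torch.cat([s, a, z], -1))
        rec = F.mse_loss(recon, s_next, reduction='none').sum(-1)
        kl = 0.5 * (mu.pow(2) + logvar.exp() - 1 - logvar).sum(-1)
        return rec + kl        # per-sample bound, usable as exploration bonus

model = TransitionCVAE(s_dim=4, a_dim=2)
s, a, s2 = torch.randn(5, 4), torch.randn(5, 2), torch.randn(5, 4)
r_int = model.nll_upper_bound(s, a, s2)   # shape (5,): intrinsic rewards
```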
arXiv Detail & Related papers (2020-10-17T09:54:51Z)
- Tracking Emotions: Intrinsic Motivation Grounded on Multi-Level Prediction Error Dynamics [68.8204255655161]
We discuss how emotions arise when differences between expected and actual rates of progress towards a goal are experienced.
We present an intrinsic motivation architecture that generates behaviors towards self-generated and dynamic goals.
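In its simplest form, the progress-rate prediction error might be computed as follows (a deliberately minimal, single-level sketch; the paper works with multi-level dynamics):

```python
# Valence as the gap between actual and expected rates of goal progress.
def valence(progress, expected_rate, dt=1.0):
    rates = [(b - a) / dt for a, b in zip(progress, progress[1:])]
    return [r - expected_rate for r in rates]

# Progress stalls after step 2, so the signal turns negative.
print(valence([0.0, 0.3, 0.4, 0.4], expected_rate=0.2))
# -> approximately [0.1, -0.1, -0.2]
```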
arXiv Detail & Related papers (2020-07-29T06:53:13Z)
- Ecological Reinforcement Learning [76.9893572776141]
We study the kinds of environment properties that can make learning under such conditions easier.
Understanding how properties of the environment impact the performance of reinforcement learning agents can help us structure our tasks in ways that make learning tractable.
arXiv Detail & Related papers (2020-06-22T17:55:03Z)
- Deep Reinforcement Learning amidst Lifelong Non-Stationarity [67.24635298387624]
We show that an off-policy RL algorithm can reason about and tackle lifelong non-stationarity.
Our method leverages latent variable models to learn a representation of the environment from current and past experiences.
We also introduce several simulation environments that exhibit lifelong non-stationarity, and empirically find that our approach substantially outperforms approaches that do not reason about environment shift.
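As a deliberately simplified stand-in for the latent variable model, the sketch below tracks a drifting environment parameter with an exponential filter over recent evidence and keeps the estimate as the policy's context (everything here is illustrative):

```python
import numpy as np
rng = np.random.default_rng(1)

# Track a drifting latent environment parameter z_t from noisy evidence
# extracted from transitions; the filtered estimate represents "what the
# environment is like now" for a context-conditioned policy.
z_true, z_hat, lam = 0.0, 0.0, 0.9
for t in range(500):
    z_true += 0.05 * rng.normal()           # environment drifts continually
    evidence = z_true + 0.1 * rng.normal()  # noisy signal from transitions
    z_hat = lam * z_hat + (1 - lam) * evidence
print(f"true z: {z_true:+.2f}  estimate: {z_hat:+.2f}")
```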
arXiv Detail & Related papers (2020-06-18T17:34:50Z)