Understanding and Preventing Capacity Loss in Reinforcement Learning
- URL: http://arxiv.org/abs/2204.09560v1
- Date: Wed, 20 Apr 2022 15:55:15 GMT
- Title: Understanding and Preventing Capacity Loss in Reinforcement Learning
- Authors: Clare Lyle, Mark Rowland, Will Dabney
- Abstract summary: We identify a mechanism by which non-stationary prediction targets can prevent learning progress in deep RL agents.
Capacity loss occurs in a range of RL agents and environments, and is particularly damaging to performance in sparse-reward tasks.
- Score: 28.52122927103544
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The reinforcement learning (RL) problem is rife with sources of
non-stationarity, making it a notoriously difficult problem domain for the
application of neural networks. We identify a mechanism by which non-stationary
prediction targets can prevent learning progress in deep RL agents:
\textit{capacity loss}, whereby networks trained on a sequence of target values
lose their ability to quickly update their predictions over time. We
demonstrate that capacity loss occurs in a range of RL agents and environments,
and is particularly damaging to performance in sparse-reward tasks. We then
present a simple regularizer, Initial Feature Regularization (InFeR), that
mitigates this phenomenon by regressing a subspace of features towards its
value at initialization, leading to significant performance improvements in
sparse-reward environments such as Montezuma's Revenge. We conclude that
preventing capacity loss is crucial to enable agents to maximally benefit from
the learning signals they obtain throughout the entire training trajectory.
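For concreteness, here is a minimal PyTorch-style sketch of the InFeR idea described above: a small set of auxiliary linear heads on top of the agent's features is regressed towards the outputs those heads produced at initialization, computed here with frozen copies of the feature network and heads. The class name, head count, scale, and coefficient are illustrative assumptions, not the authors' exact implementation or hyperparameters.

```python
import copy
import torch
import torch.nn as nn

class InFeRRegularizer(nn.Module):
    """Sketch of Initial Feature Regularization (InFeR).

    Auxiliary linear heads map the network's features to a small output
    subspace; their outputs are regressed towards the values produced at
    initialization (obtained here from frozen copies of the network and
    heads). Head count and scale are illustrative, not the paper's values.
    """

    def __init__(self, feature_net: nn.Module, feature_dim: int,
                 num_heads: int = 10, scale: float = 1.0):
        super().__init__()
        self.feature_net = feature_net
        self.heads = nn.Linear(feature_dim, num_heads)
        self.scale = scale
        # Frozen snapshots taken at initialization define the regression targets.
        self.init_net = copy.deepcopy(feature_net).requires_grad_(False)
        self.init_heads = copy.deepcopy(self.heads).requires_grad_(False)

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        preds = self.heads(self.feature_net(obs))
        with torch.no_grad():
            targets = self.scale * self.init_heads(self.init_net(obs))
        return ((preds - targets) ** 2).mean()

# Usage (hypothetical training loop): add the penalty to the usual TD loss,
#   loss = td_loss + infer_coef * infer_reg(obs_batch)
```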
Related papers
- Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models [79.28821338925947]
Domain-Class Incremental Learning is a realistic but challenging continual learning scenario.
To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability.
This incurs a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability.
Existing methods tackle it by tuning VLMs with knowledge distillation on extra datasets, which demands heavy overhead.
We propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, retaining pre-trained knowledge of
arXiv Detail & Related papers (2024-07-07T12:19:37Z) - Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
arXiv Detail & Related papers (2022-06-05T08:49:16Z) - Generalization, Mayhems and Limits in Recurrent Proximal Policy
Optimization [1.8570591025615453]
We highlight vital details that one must get right when adding recurrence to achieve a correct and efficient implementation.
We explore the limitations of recurrent PPO by benchmarking the contributed novel environments Mortar Mayhem and Searing Spotlights.
Remarkably, we can demonstrate a transition to strong generalization in Mortar Mayhem when scaling the number of training seeds.
arXiv Detail & Related papers (2022-05-23T07:54:15Z) - The Impact of Activation Sparsity on Overfitting in Convolutional Neural
Networks [1.9424280683610138]
Overfitting is one of the fundamental challenges when training convolutional neural networks.
In this study we introduce a perplexity-based sparsity definition to derive and visualise layer-wise activation measures.
arXiv Detail & Related papers (2021-04-13T12:55:37Z) - Dynamics Generalization via Information Bottleneck in Deep Reinforcement
Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
arXiv Detail & Related papers (2020-08-03T02:24:20Z) - Untangling tradeoffs between recurrence and self-attention in neural
networks [81.30894993852813]
We present a formal analysis of how self-attention affects gradient propagation in recurrent networks.
We prove that it mitigates the problem of vanishing gradients when trying to capture long-term dependencies.
We propose a relevancy screening mechanism that allows for a scalable use of sparse self-attention with recurrence.
arXiv Detail & Related papers (2020-06-16T19:24:25Z) - Transient Non-Stationarity and Generalisation in Deep Reinforcement
Learning [67.34810824996887]
Non-stationarity can arise in Reinforcement Learning (RL) even in stationary environments.
We propose Iterated Relearning (ITER) to improve generalisation of deep RL agents.
arXiv Detail & Related papers (2020-06-10T13:26:31Z) - Feature Purification: How Adversarial Training Performs Robust Deep
Learning [66.05472746340142]
We present a principle we call Feature Purification: one cause of the existence of adversarial examples is the accumulation of certain small dense mixtures in the hidden weights during the training of a neural network.
We present both experiments on the CIFAR-10 dataset to illustrate this principle, and a theoretical result proving that for certain natural classification tasks, training a two-layer neural network with ReLU activation using randomly initialized gradient descent indeed satisfies this principle.
arXiv Detail & Related papers (2020-05-20T16:56:08Z) - Exploiting the Full Capacity of Deep Neural Networks while Avoiding
Overfitting by Targeted Sparsity Regularization [1.3764085113103217]
Overfitting is one of the most common problems when training deep neural networks on comparatively small datasets.
We propose novel targeted sparsity visualization and regularization strategies to counteract overfitting.
arXiv Detail & Related papers (2020-02-21T11:38:17Z)