Spectral Normalisation for Deep Reinforcement Learning: an Optimisation
Perspective
- URL: http://arxiv.org/abs/2105.05246v1
- Date: Tue, 11 May 2021 17:59:46 GMT
- Title: Spectral Normalisation for Deep Reinforcement Learning: an Optimisation
Perspective
- Authors: Florin Gogianu, Tudor Berariu, Mihaela Rosca, Claudia Clopath,
Lucian Busoniu, Razvan Pascanu
- Abstract summary: We show we can recover the performance of recent developments not by changing the objective, but by regularising the value-function estimator.
We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics.
These findings hint towards the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
- Score: 22.625456135981292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most of the recent deep reinforcement learning advances take an RL-centric
perspective and focus on refinements of the training objective. We diverge from
this view and show we can recover the performance of these developments not by
changing the objective, but by regularising the value-function estimator.
Constraining the Lipschitz constant of a single layer using spectral
normalisation is sufficient to elevate the performance of a Categorical-DQN
agent to that of a more elaborate Rainbow agent on the challenging Atari
domain. We conduct ablation studies to disentangle the various effects
normalisation has on the learning dynamics and show that it is sufficient to
modulate the parameter updates to recover most of the performance of spectral
normalisation. These findings hint towards the need to also focus on the neural
component and its learning dynamics to tackle the peculiarities of Deep
Reinforcement Learning.
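For concreteness, here is a minimal PyTorch sketch of the core intervention: spectral normalisation applied to a single dense layer of a value network so that its Lipschitz constant is bounded. The architecture is illustrative, not the paper's exact configuration (the paper's agents are convolutional Categorical-DQN networks on Atari); only the placement of the constraint is the point.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

class QNetwork(nn.Module):
    """DQN-style value network with spectral normalisation on one layer.

    spectral_norm() estimates the largest singular value of the weight
    matrix by power iteration and rescales the weights to W / sigma_max,
    constraining that layer's Lipschitz constant to (at most) 1.
    """

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
            spectral_norm(nn.Linear(hidden, hidden)),  # the single normalised layer
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

q = QNetwork(obs_dim=128, n_actions=18)
print(q(torch.randn(32, 128)).shape)  # torch.Size([32, 18])
```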
Related papers
- PEARL: Preconditioner Enhancement through Actor-critic Reinforcement Learning [5.433548785820674]
We present PEARL (Preconditioner Enhancement through Actor-critic Reinforcement Learning), a novel approach to learning matrix preconditioners.
Recent advances have explored using deep neural networks to learn preconditioners, though challenges such as misbehaved objective functions and costly training procedures remain.
arXiv Detail & Related papers (2025-01-18T12:19:18Z)
- Point-Calibrated Spectral Neural Operators [54.13671100638092]
We introduce the Point-Calibrated Spectral Transform, with which Point-Calibrated Spectral Neural Operators learn operator mappings by approximating functions with a point-level adaptive spectral basis.
arXiv Detail & Related papers (2024-10-15T08:19:39Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple reparameterization which we call Normalize-and-Project.
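A hedged sketch of the mechanism this summary describes: if normalization layers make the network scale-invariant, projecting weights back to a fixed norm after each step removes the implicit learning-rate decay. The actual Normalize-and-Project procedure may differ in detail; the per-matrix projection and `target_norms` bookkeeping below are illustrative assumptions.

```python
import torch

@torch.no_grad()
def project_to_fixed_norm(net: torch.nn.Module, target_norms: dict) -> None:
    """Rescale each weight matrix back to a fixed Frobenius norm after the
    optimizer step, so the effective learning rate stays explicit instead
    of decaying as parameter norms grow (illustrative assumption)."""
    for name, p in net.named_parameters():
        if name in target_norms:
            p.mul_(target_norms[name] / p.norm().clamp_min(1e-12))

# Record norms once at initialisation, then project after every step:
# target_norms = {n: p.norm().item() for n, p in net.named_parameters() if p.dim() > 1}
# loss.backward(); opt.step(); project_to_fixed_norm(net, target_norms)
```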
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
- Learning Continually by Spectral Regularization [45.55508032009977]
Continual learning algorithms seek to mitigate loss of plasticity by sustaining good performance while maintaining network trainability.
We develop a new technique for improving continual learning inspired by the observation that the singular values of the neural network parameters at initialization are an important factor for trainability during early phases of learning.
We present an experimental analysis that shows how the proposed spectral regularizer can sustain trainability and performance across a range of model architectures in continual supervised and reinforcement learning settings.
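As a rough illustration of a spectral regularizer in this spirit (the paper's exact penalty may differ), one can penalise drift of each weight matrix's largest singular value away from a well-conditioned target such as 1:

```python
import torch

def spectral_regularizer(net: torch.nn.Module, target: float = 1.0) -> torch.Tensor:
    """Penalise (sigma_max(W) - target)^2 for every dense weight matrix;
    matrix_norm(..., ord=2) is the largest singular value and is
    differentiable, so the penalty can be added to the training loss."""
    terms = [
        (torch.linalg.matrix_norm(p, ord=2) - target) ** 2
        for p in net.parameters()
        if p.dim() == 2  # conv kernels would need reshaping first
    ]
    return torch.stack(terms).sum()

# loss = task_loss + lam * spectral_regularizer(net)
```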
arXiv Detail & Related papers (2024-06-10T21:34:43Z)
- Rich-Observation Reinforcement Learning with Continuous Latent Dynamics [43.84391209459658]
We introduce a new theoretical framework, RichCLD (Rich-Observation RL with Continuous Latent Dynamics), in which the agent performs control based on high-dimensional observations.
Our main contribution is a new algorithm for this setting that is provably statistically and computationally efficient.
arXiv Detail & Related papers (2024-05-29T17:02:49Z)
- Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for such data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
- A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations [9.240917262195046]
We propose an algorithm that automatically adjusts the weights of each component in the augmented loss function.
Experiments on a variety of continuous control tasks demonstrate that the proposed algorithm outperforms various benchmarks.
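The summary does not spell out the weighting rule, so the following is only a stand-in: a common way to adjust loss-component weights automatically is uncertainty weighting with learnable log-variances, shown here for a TD loss augmented with an expert-observation term (both names hypothetical).

```python
import torch

class AdaptiveLossWeights(torch.nn.Module):
    """Stand-in for an automatically weighted augmented loss: each term
    gets a learnable log-variance, so gradient descent balances the
    components itself (uncertainty weighting; not the paper's exact rule)."""

    def __init__(self, n_terms: int = 2):
        super().__init__()
        self.log_vars = torch.nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses) -> torch.Tensor:
        total = torch.zeros((), device=self.log_vars.device)
        for loss, log_var in zip(losses, self.log_vars):
            total = total + torch.exp(-log_var) * loss + log_var
        return total

# weigher = AdaptiveLossWeights(n_terms=2)
# total_loss = weigher([td_loss, expert_observation_loss])
```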
arXiv Detail & Related papers (2024-02-29T03:53:02Z)
- Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
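For reference, the object under study is the semi-gradient temporal difference update; a minimal TD(0) step for a value network looks like this (a sketch, not the paper's code):

```python
import torch
import torch.nn.functional as F

def td0_step(value_net, opt, s, r, s_next, done, gamma=0.99):
    """One semi-gradient TD(0) step: regress V(s) toward the bootstrapped
    target r + gamma * V(s'), with the target detached (no gradient)."""
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * value_net(s_next).squeeze(-1)
    loss = F.mse_loss(value_net(s).squeeze(-1), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```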
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
- Neurally Augmented ALISTA [15.021419552695066]
We introduce Neurally Augmented ALISTA, in which an LSTM network is used to compute step sizes and thresholds individually for each target vector during reconstruction.
We show that our approach further improves empirical performance in sparse reconstruction, in particular outperforming existing algorithms by an increasing margin as the compression ratio becomes more challenging.
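A sketch of the iteration being augmented: each (A)LISTA step is a residual correction through a fixed analytic matrix W followed by soft-thresholding; in Neurally Augmented ALISTA the step size gamma and threshold theta for each iteration come from an LSTM, per target vector (the LSTM itself is omitted here).

```python
import torch

def soft_threshold(x: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Proximal operator of the L1 norm."""
    return torch.sign(x) * torch.clamp(x.abs() - theta, min=0.0)

def alista_step(x, y, A, W, gamma, theta):
    """One unrolled (A)LISTA iteration for y ~ A @ x with sparse x.
    In NA-ALISTA, gamma and theta are LSTM outputs rather than fixed
    learned scalars."""
    return soft_threshold(x + gamma * (W.t() @ (y - A @ x)), theta)
```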
arXiv Detail & Related papers (2020-10-05T11:39:49Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
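A minimal sketch of an information-bottleneck regularized objective in this spirit (the paper's exact objective and annealing schedule may differ): a stochastic encoder q(z|s) pays a KL cost toward a fixed prior, and the coefficient beta is annealed over training.

```python
import torch
from torch.distributions import Normal, kl_divergence

def ib_loss(task_loss, mu, sigma, beta):
    """task_loss + beta * KL(q(z|s) || N(0, I)); the KL term squeezes out
    state information the task does not need."""
    q = Normal(mu, sigma)
    prior = Normal(torch.zeros_like(mu), torch.ones_like(sigma))
    kl = kl_divergence(q, prior).sum(-1).mean()
    return task_loss + beta * kl

# Annealing (assumed schedule): ramp beta up over training,
# beta_t = beta_max * min(1.0, t / warmup_steps)
```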
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Disentangling Adaptive Gradient Methods from Learning Rates [65.0397050979662]
We take a deeper look at how adaptive gradient methods interact with the learning rate schedule.
We introduce a "grafting" experiment which decouples an update's magnitude from its direction.
We present some empirical and theoretical retrospectives on the generalization of adaptive gradient methods.
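The grafting experiment combines two optimisers' proposed steps, taking the magnitude from one and the direction from the other; a per-tensor sketch (the paper applies this layer-wise, e.g. with Adam and SGD as the two algorithms):

```python
import torch

@torch.no_grad()
def graft(step_m: torch.Tensor, step_d: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Grafted update: the norm (magnitude) of optimiser M's step applied
    along the direction of optimiser D's step."""
    return step_m.norm() * step_d / step_d.norm().clamp_min(eps)

# If the grafted run tracks D's run, the direction drives the gains;
# if it tracks M's, the per-step magnitude (implicit schedule) does.
```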
arXiv Detail & Related papers (2020-02-26T21:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.