Spectral Normalisation for Deep Reinforcement Learning: an Optimisation
Perspective
- URL: http://arxiv.org/abs/2105.05246v1
- Date: Tue, 11 May 2021 17:59:46 GMT
- Title: Spectral Normalisation for Deep Reinforcement Learning: an Optimisation
Perspective
- Authors: Florin Gogianu, Tudor Berariu, Mihaela Rosca, Claudia Clopath,
Lucian Busoniu, Razvan Pascanu
- Abstract summary: We show we can recover the performance of recent developments not by changing the objective, but by regularising the value-function estimator.
We conduct ablation studies to disentangle the various effects normalisation has on the learning dynamics.
These findings hint towards the need to also focus on the neural component and its learning dynamics to tackle the peculiarities of Deep Reinforcement Learning.
- Score: 22.625456135981292
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Most of the recent deep reinforcement learning advances take an RL-centric
perspective and focus on refinements of the training objective. We diverge from
this view and show we can recover the performance of these developments not by
changing the objective, but by regularising the value-function estimator.
Constraining the Lipschitz constant of a single layer using spectral
normalisation is sufficient to elevate the performance of a Categorical-DQN
agent to that of a more elaborate Rainbow agent on the challenging Atari
domain. We conduct ablation studies to disentangle the various effects
normalisation has on the learning dynamics and show that it is sufficient to
modulate the parameter updates to recover most of the performance of spectral
normalisation. These findings hint towards the need to also focus on the neural
component and its learning dynamics to tackle the peculiarities of Deep
Reinforcement Learning.
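For concreteness, here is a minimal PyTorch sketch of the core intervention: spectral normalisation applied to a single dense layer of a value network so that its Lipschitz constant is bounded. The architecture is illustrative, not the paper's exact configuration (the paper's agents are convolutional Categorical-DQN networks on Atari); only the placement of the constraint is the point.

```python
import torch
import torch.nn as nn
from torch.nn.utils.parametrizations import spectral_norm

class QNetwork(nn.Module):
    """DQN-style value network with spectral normalisation on one layer.

    spectral_norm() estimates the largest singular value of the weight
    matrix by power iteration and rescales the weights to W / sigma_max,
    constraining that layer's Lipschitz constant to (at most) 1.
    """

    def __init__(self, obs_dim: int, n_actions: int, hidden: int = 512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(obs_dim, hidden),
            nn.ReLU(),
            spectral_norm(nn.Linear(hidden, hidden)),  # the single normalised layer
            nn.ReLU(),
            nn.Linear(hidden, n_actions),
        )

    def forward(self, obs: torch.Tensor) -> torch.Tensor:
        return self.net(obs)

q = QNetwork(obs_dim=128, n_actions=18)
print(q(torch.randn(32, 128)).shape)  # torch.Size([32, 18])
```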
Related papers
- PEARL: Preconditioner Enhancement through Actor-critic Reinforcement Learning [5.433548785820674]
We present PEARL (Preconditioner Enhancement through Actor-critic Reinforcement Learning), a novel approach to learning matrix preconditioners.
Recent advances have explored using deep neural networks to learn preconditioners, though challenges such as misbehaved objective functions and costly training procedures remain.
arXiv Detail & Related papers (2025-01-18T12:19:18Z)
- Point-Calibrated Spectral Neural Operators [54.13671100638092]
We introduce the Point-Calibrated Spectral Transform, with which Point-Calibrated Spectral Neural Operators learn operator mappings by approximating functions with a point-level adaptive spectral basis.
arXiv Detail & Related papers (2024-10-15T08:19:39Z)
- Normalization and effective learning rates in reinforcement learning [52.59508428613934]
Normalization layers have recently experienced a renaissance in the deep reinforcement learning and continual learning literature.
We show that normalization brings with it a subtle but important side effect: an equivalence between growth in the norm of the network parameters and decay in the effective learning rate.
We propose to make the learning rate schedule explicit with a simple reparameterization which we call Normalize-and-Project.
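A hedged sketch of the mechanism this summary describes: if normalization layers make the network scale-invariant, projecting weights back to a fixed norm after each step removes the implicit learning-rate decay. The actual Normalize-and-Project procedure may differ in detail; the per-matrix projection and `target_norms` bookkeeping below are illustrative assumptions.

```python
import torch

@torch.no_grad()
def project_to_fixed_norm(net: torch.nn.Module, target_norms: dict) -> None:
    """Rescale each weight matrix back to a fixed Frobenius norm after the
    optimizer step, so the effective learning rate stays explicit instead
    of decaying as parameter norms grow (illustrative assumption)."""
    for name, p in net.named_parameters():
        if name in target_norms:
            p.mul_(target_norms[name] / p.norm().clamp_min(1e-12))

# Record norms once at initialisation, then project after every step:
# target_norms = {n: p.norm().item() for n, p in net.named_parameters() if p.dim() > 1}
# loss.backward(); opt.step(); project_to_fixed_norm(net, target_norms)
```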
arXiv Detail & Related papers (2024-07-01T20:58:01Z)
- Learning Continually by Spectral Regularization [45.55508032009977]
Continual learning algorithms seek to mitigate loss of plasticity by sustaining good performance while maintaining network trainability.
We develop a new technique for improving continual learning inspired by the observation that the singular values of the neural network parameters at initialization are an important factor for trainability during early phases of learning.
We present an experimental analysis that shows how the proposed spectral regularizer can sustain trainability and performance across a range of model architectures in continual supervised and reinforcement learning settings.
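As a rough illustration of a spectral regularizer in this spirit (the paper's exact penalty may differ), one can penalise drift of each weight matrix's largest singular value away from a well-conditioned target such as 1:

```python
import torch

def spectral_regularizer(net: torch.nn.Module, target: float = 1.0) -> torch.Tensor:
    """Penalise (sigma_max(W) - target)^2 for every dense weight matrix;
    matrix_norm(..., ord=2) is the largest singular value and is
    differentiable, so the penalty can be added to the training loss."""
    terms = [
        (torch.linalg.matrix_norm(p, ord=2) - target) ** 2
        for p in net.parameters()
        if p.dim() == 2  # conv kernels would need reshaping first
    ]
    return torch.stack(terms).sum()

# loss = task_loss + lam * spectral_regularizer(net)
```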
arXiv Detail & Related papers (2024-06-10T21:34:43Z)
- Rich-Observation Reinforcement Learning with Continuous Latent Dynamics [43.84391209459658]
We introduce a new theoretical framework, RichCLD (Rich-Observation RL with Continuous Latent Dynamics), in which the agent performs control based on high-dimensional observations.
Our main contribution is a new algorithm for this setting that is provably statistically and computationally efficient.
arXiv Detail & Related papers (2024-05-29T17:02:49Z)
- Self-STORM: Deep Unrolled Self-Supervised Learning for Super-Resolution Microscopy [55.2480439325792]
We introduce deep unrolled self-supervised learning, which alleviates the need for such data by training a sequence-specific, model-based autoencoder.
Our proposed method exceeds the performance of its supervised counterparts.
arXiv Detail & Related papers (2024-03-25T17:40:32Z)
- A Model-Based Approach for Improving Reinforcement Learning Efficiency Leveraging Expert Observations [9.240917262195046]
We propose an algorithm that automatically adjusts the weights of each component in the augmented loss function.
Experiments on a variety of continuous control tasks demonstrate that the proposed algorithm outperforms various benchmarks.
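The summary does not spell out the weighting rule, so the following is only a stand-in: a common way to adjust loss-component weights automatically is uncertainty weighting with learnable log-variances, shown here for a TD loss augmented with an expert-observation term (both names hypothetical).

```python
import torch

class AdaptiveLossWeights(torch.nn.Module):
    """Stand-in for an automatically weighted augmented loss: each term
    gets a learnable log-variance, so gradient descent balances the
    components itself (uncertainty weighting; not the paper's exact rule)."""

    def __init__(self, n_terms: int = 2):
        super().__init__()
        self.log_vars = torch.nn.Parameter(torch.zeros(n_terms))

    def forward(self, losses) -> torch.Tensor:
        total = torch.zeros((), device=self.log_vars.device)
        for loss, log_var in zip(losses, self.log_vars):
            total = total + torch.exp(-log_var) * loss + log_var
        return total

# weigher = AdaptiveLossWeights(n_terms=2)
# total_loss = weigher([td_loss, expert_observation_loss])
```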
arXiv Detail & Related papers (2024-02-29T03:53:02Z)
- Learning Dynamics and Generalization in Reinforcement Learning [59.530058000689884]
We show theoretically that temporal difference learning encourages agents to fit non-smooth components of the value function early in training.
We show that neural networks trained using temporal difference algorithms on dense reward tasks exhibit weaker generalization between states than randomly initialized networks and networks trained with policy gradient methods.
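For reference, the object under study is the semi-gradient temporal difference update; a minimal TD(0) step for a value network looks like this (a sketch, not the paper's code):

```python
import torch
import torch.nn.functional as F

def td0_step(value_net, opt, s, r, s_next, done, gamma=0.99):
    """One semi-gradient TD(0) step: regress V(s) toward the bootstrapped
    target r + gamma * V(s'), with the target detached (no gradient)."""
    with torch.no_grad():
        target = r + gamma * (1.0 - done) * value_net(s_next).squeeze(-1)
    loss = F.mse_loss(value_net(s).squeeze(-1), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss.item()
```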
arXiv Detail & Related papers (2022-06-05T08:49:16Z)
- Neurally Augmented ALISTA [15.021419552695066]
We introduce Neurally Augmented ALISTA, in which an LSTM network is used to compute step sizes and thresholds individually for each target vector during reconstruction.
We show that our approach further improves empirical performance in sparse reconstruction, in particular outperforming existing algorithms by an increasing margin as the compression ratio becomes more challenging.
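A sketch of the iteration being augmented: each (A)LISTA step is a residual correction through a fixed analytic matrix W followed by soft-thresholding; in Neurally Augmented ALISTA the step size gamma and threshold theta for each iteration come from an LSTM, per target vector (the LSTM itself is omitted here).

```python
import torch

def soft_threshold(x: torch.Tensor, theta: torch.Tensor) -> torch.Tensor:
    """Proximal operator of the L1 norm."""
    return torch.sign(x) * torch.clamp(x.abs() - theta, min=0.0)

def alista_step(x, y, A, W, gamma, theta):
    """One unrolled (A)LISTA iteration for y ~ A @ x with sparse x.
    In NA-ALISTA, gamma and theta are LSTM outputs rather than fixed
    learned scalars."""
    return soft_threshold(x + gamma * (W.t() @ (y - A @ x)), theta)
```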
arXiv Detail & Related papers (2020-10-05T11:39:49Z)
- Dynamics Generalization via Information Bottleneck in Deep Reinforcement Learning [90.93035276307239]
We propose an information theoretic regularization objective and an annealing-based optimization method to achieve better generalization ability in RL agents.
We demonstrate the extreme generalization benefits of our approach in different domains ranging from maze navigation to robotic tasks.
This work provides a principled way to improve generalization in RL by gradually removing information that is redundant for task-solving.
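A minimal sketch of an information-bottleneck regularized objective in this spirit (the paper's exact objective and annealing schedule may differ): a stochastic encoder q(z|s) pays a KL cost toward a fixed prior, and the coefficient beta is annealed over training.

```python
import torch
from torch.distributions import Normal, kl_divergence

def ib_loss(task_loss, mu, sigma, beta):
    """task_loss + beta * KL(q(z|s) || N(0, I)); the KL term squeezes out
    state information the task does not need."""
    q = Normal(mu, sigma)
    prior = Normal(torch.zeros_like(mu), torch.ones_like(sigma))
    kl = kl_divergence(q, prior).sum(-1).mean()
    return task_loss + beta * kl

# Annealing (assumed schedule): ramp beta up over training,
# beta_t = beta_max * min(1.0, t / warmup_steps)
```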
arXiv Detail & Related papers (2020-08-03T02:24:20Z)
- Disentangling Adaptive Gradient Methods from Learning Rates [65.0397050979662]
We take a deeper look at how adaptive gradient methods interact with the learning rate schedule.
We introduce a "grafting" experiment which decouples an update's magnitude from its direction.
We present some empirical and theoretical retrospectives on the generalization of adaptive gradient methods.
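The grafting experiment combines two optimisers' proposed steps, taking the magnitude from one and the direction from the other; a per-tensor sketch (the paper applies this layer-wise, e.g. with Adam and SGD as the two algorithms):

```python
import torch

@torch.no_grad()
def graft(step_m: torch.Tensor, step_d: torch.Tensor, eps: float = 1e-12) -> torch.Tensor:
    """Grafted update: the norm (magnitude) of optimiser M's step applied
    along the direction of optimiser D's step."""
    return step_m.norm() * step_d / step_d.norm().clamp_min(eps)

# If the grafted run tracks D's run, the direction drives the gains;
# if it tracks M's, the per-step magnitude (implicit schedule) does.
```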
arXiv Detail & Related papers (2020-02-26T21:42:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.