The State of Sparse Training in Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2206.10369v1
- Date: Fri, 17 Jun 2022 14:08:00 GMT
- Title: The State of Sparse Training in Deep Reinforcement Learning
- Authors: Laura Graesser, Utku Evci, Erich Elsen, Pablo Samuel Castro
- Abstract summary: The use of sparse neural networks has seen rapid growth in recent years, particularly in computer vision.
Their appeal stems largely from the reduced number of parameters required to train and store, as well as an increase in learning efficiency.
We perform a systematic investigation into applying a number of existing sparse training techniques on a variety of Deep Reinforcement Learning agents and environments.
- Score: 23.034856834801346
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The use of sparse neural networks has seen rapid growth in recent years,
particularly in computer vision. Their appeal stems largely from the reduced
number of parameters required to train and store, as well as an increase in
learning efficiency. Somewhat surprisingly, there have been very few efforts
exploring their use in Deep Reinforcement Learning (DRL). In this work we
perform a systematic investigation into applying a number of existing sparse
training techniques on a variety of DRL agents and environments. Our results
corroborate, in the DRL domain, a central finding from sparse training in
computer vision: sparse networks perform better than dense networks for the
same parameter count. We provide detailed analyses of how the various components
in DRL are affected by the use of sparse networks and conclude by suggesting
promising avenues for improving the effectiveness of sparse training methods,
as well as for advancing their use in DRL.
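To make the setting concrete, here is a minimal sketch of one family of techniques the paper investigates: training a DRL value network under a fixed random sparsity mask (static sparse training). The network shape, sparsity level, and helper names below are illustrative assumptions, not the authors' exact configuration.
```python
import torch
import torch.nn as nn

def make_random_masks(model: nn.Module, sparsity: float = 0.9) -> dict:
    """One fixed binary mask per weight matrix; `sparsity` is the
    fraction of weights held at zero (0.9 keeps 10% of weights)."""
    return {name: (torch.rand_like(p) > sparsity).float()
            for name, p in model.named_parameters() if p.dim() >= 2}

def apply_masks(model: nn.Module, masks: dict) -> None:
    """Zero out masked weights; call after init and after every optimizer step."""
    with torch.no_grad():
        for name, p in model.named_parameters():
            if name in masks:
                p.mul_(masks[name])

# Illustrative Q-network: 8-dim observations, 4 discrete actions.
q_net = nn.Sequential(nn.Linear(8, 256), nn.ReLU(),
                      nn.Linear(256, 256), nn.ReLU(),
                      nn.Linear(256, 4))
masks = make_random_masks(q_net, sparsity=0.9)
apply_masks(q_net, masks)

opt = torch.optim.Adam(q_net.parameters(), lr=3e-4)
obs, td_target = torch.randn(32, 8), torch.randn(32, 4)  # stand-in batch
loss = nn.functional.mse_loss(q_net(obs), td_target)
opt.zero_grad(); loss.backward(); opt.step()
apply_masks(q_net, masks)  # re-zero so the network stays sparse
```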
Related papers
- Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
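The train-small/evaluate-large property holds when the network is fully convolutional, so no layer assumes a fixed input size. A minimal sketch (channel counts and sizes are my own illustrative choices, not the paper's):
```python
import torch
import torch.nn as nn

# Fully convolutional: no dense head, so the same weights accept any
# spatial size -- train on small windows, evaluate on large signals.
fcn = nn.Sequential(
    nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 16, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 1, kernel_size=1),  # per-location prediction
)

small = torch.randn(8, 1, 32, 32)      # batch of training windows
large = torch.randn(1, 1, 1024, 1024)  # full-size signal at test time
assert fcn(small).shape[-2:] == (32, 32)
assert fcn(large).shape[-2:] == (1024, 1024)
```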
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
- Pretraining in Deep Reinforcement Learning: A Survey [17.38360092869849]
Pretraining has been shown to be effective for acquiring transferable knowledge.
Due to the nature of reinforcement learning, pretraining in this field is faced with unique challenges.
arXiv Detail & Related papers (2022-11-08T02:17:54Z)
- Exploring Low Rank Training of Deep Neural Networks [49.18122605463354]
Training deep neural networks in low rank is more efficient than unfactorised training in terms of both memory consumption and training time.
We analyse techniques that work well in practice, and through extensive ablations on models such as GPT2 we provide evidence falsifying common beliefs in the field.
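The core idea is to replace a dense weight matrix with a product of two thin matrices. A sketch of such a factorized layer (the dimensions and rank are illustrative, not the paper's exact parameterization):
```python
import torch
import torch.nn as nn

class LowRankLinear(nn.Module):
    """Factorize a d_out x d_in weight as U @ V with inner rank r.

    Parameter count drops from d_in*d_out to roughly r*(d_in + d_out),
    a saving whenever r < d_in*d_out / (d_in + d_out).
    """
    def __init__(self, d_in: int, d_out: int, rank: int):
        super().__init__()
        self.V = nn.Linear(d_in, rank, bias=False)  # d_in -> r
        self.U = nn.Linear(rank, d_out)             # r -> d_out

    def forward(self, x):
        return self.U(self.V(x))

dense = nn.Linear(1024, 1024)                  # ~1.05M weights
low_rank = LowRankLinear(1024, 1024, rank=64)  # ~132K weights
print(sum(p.numel() for p in low_rank.parameters()))
```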
arXiv Detail & Related papers (2022-09-27T17:43:45Z)
- Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) provides a promising, energy-efficient way to tackle realistic control tasks.
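As a rough illustration of the mechanism (not the paper's architecture), the sketch below rate-codes Q-values with a single leaky integrate-and-fire layer; real SNN training additionally needs surrogate gradients, which are omitted, and all sizes and constants are assumptions.
```python
import torch

def lif_q_values(x, w_in, w_out, steps=16, decay=0.9, threshold=1.0):
    """Estimate Q-values with one leaky integrate-and-fire (LIF) layer.

    The input is presented for `steps` timesteps; hidden neurons integrate
    current, spike when the membrane potential crosses `threshold`, and
    reset. Q-values are read out from the average hidden spike rate.
    """
    v = torch.zeros(x.shape[0], w_in.shape[1])  # membrane potentials
    rate = torch.zeros_like(v)                  # accumulated spikes
    for _ in range(steps):
        v = decay * v + x @ w_in                # leaky integration
        spikes = (v >= threshold).float()       # fire
        v = v * (1.0 - spikes)                  # reset fired neurons
        rate += spikes
    return (rate / steps) @ w_out               # rate-coded Q readout

obs = torch.randn(4, 8)                         # batch of observations
w_in, w_out = torch.randn(8, 64) * 0.5, torch.randn(64, 3) * 0.1
print(lif_q_values(obs, w_in, w_out).shape)     # -> torch.Size([4, 3])
```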
arXiv Detail & Related papers (2022-01-21T16:42:11Z)
- Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems.
One way to reduce the size and cost of the networks it relies on is to prune them, leaving only the necessary parameters.
We close the gap between RL and single-shot pruning techniques and present a general pruning approach for Offline RL.
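One well-known instance of single-shot pruning is SNIP-style saliency scoring at initialization; the sketch below shows that idea on a toy offline batch, though the paper's exact criterion may differ and all shapes are illustrative.
```python
import torch
import torch.nn as nn

def snip_masks(model: nn.Module, loss: torch.Tensor, keep_ratio: float = 0.1):
    """Single-shot pruning at initialization, in the spirit of SNIP.

    Score each weight by |w * dL/dw| on one batch, keep the top
    `keep_ratio` fraction globally, and train only surviving weights.
    """
    weights = [p for p in model.parameters() if p.dim() >= 2]
    grads = torch.autograd.grad(loss, weights)
    scores = [w.detach().abs() * g.abs() for w, g in zip(weights, grads)]
    flat = torch.cat([s.flatten() for s in scores])
    k = int(keep_ratio * flat.numel())
    threshold = torch.topk(flat, k).values.min()
    return [(s >= threshold).float() for s in scores]

net = nn.Sequential(nn.Linear(17, 256), nn.ReLU(), nn.Linear(256, 6))
batch, targets = torch.randn(64, 17), torch.randn(64, 6)  # stand-in data
loss = nn.functional.mse_loss(net(batch), targets)
masks = snip_masks(net, loss, keep_ratio=0.1)  # apply as in sparse training
```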
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
- On The Transferability of Deep-Q Networks [6.822707222147354]
Transfer Learning (TL) is an efficient machine learning paradigm that helps overcome some of the hurdles that characterize the successful training of deep neural networks.
While exploiting TL is a well-established and successful training practice in Supervised Learning (SL), it is applied far less often in Deep Reinforcement Learning (DRL).
In this paper, we study the level of transferability of three different variants of Deep-Q Networks on popular DRL benchmarks and on a set of novel, carefully designed control tasks.
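A typical way to probe such transferability is to initialize the target-task network with the source network's weights, reusing everything except the task-specific Q head. A toy sketch under assumed architectures (the layer indices refer only to this example):
```python
import torch.nn as nn

def make_dqn(n_actions: int) -> nn.Sequential:
    return nn.Sequential(nn.Linear(8, 128), nn.ReLU(),
                         nn.Linear(128, 128), nn.ReLU(),
                         nn.Linear(128, n_actions))

source = make_dqn(n_actions=4)  # pretrained on the source task (stand-in)
target = make_dqn(n_actions=6)  # new task, different action set

# Transfer everything except the final Q head ("4." in this Sequential).
state = {k: v for k, v in source.state_dict().items()
         if not k.startswith("4.")}
target.load_state_dict(state, strict=False)  # head stays freshly initialized
```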
arXiv Detail & Related papers (2021-10-06T10:29:37Z)
- Improving Computational Efficiency in Visual Reinforcement Learning via Stored Embeddings [89.63764845984076]
We present Stored Embeddings for Efficient Reinforcement Learning (SEER), a simple modification of existing off-policy deep reinforcement learning methods.
We show that SEER does not degrade the performance of RL agents while significantly saving computation and memory.
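SEER's core idea, as I read it: freeze the lower layers (the encoder) early in training and thereafter cache latent embeddings in the replay buffer instead of raw pixels, so later updates skip the expensive convolutional computation. A toy sketch; the encoder, freeze point, and buffer below are all illustrative:
```python
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Conv2d(4, 32, 8, stride=4), nn.ReLU(),
                        nn.Flatten())  # illustrative conv encoder

FREEZE_AT = 100_000  # env steps after which the encoder is frozen
replay = []          # toy replay buffer

def store_transition(step, frame, action, reward):
    """Once the encoder is frozen, cache its embedding instead of pixels;
    gradient updates on the head then skip the conv forward/backward."""
    if step >= FREEZE_AT:
        with torch.no_grad():
            frame = encoder(frame.unsqueeze(0)).squeeze(0)  # store latent
    replay.append((frame, action, reward))

store_transition(step=150_000, frame=torch.randn(4, 84, 84),
                 action=0, reward=1.0)
```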
arXiv Detail & Related papers (2021-03-04T08:14:10Z)
- Training Larger Networks for Deep Reinforcement Learning [18.193180866998333]
We show that naively increasing network capacity does not improve performance.
We propose a novel method that consists of 1) wider networks with DenseNet connections, 2) decoupling representation learning from RL training, and 3) a distributed training method to mitigate overfitting.
Using this three-fold technique, we show that we can train very large networks that result in significant performance gains.
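Point 1 of that recipe, DenseNet-style connectivity in an MLP trunk, can be sketched as below; widths and depth are illustrative, and the other two components are not reproduced here.
```python
import torch
import torch.nn as nn

class DenseBlockMLP(nn.Module):
    """MLP with DenseNet-style connectivity: each layer sees the
    concatenation of the raw input and all earlier layers' outputs."""
    def __init__(self, d_in: int, width: int, depth: int):
        super().__init__()
        self.layers = nn.ModuleList()
        d = d_in
        for _ in range(depth):
            self.layers.append(nn.Linear(d, width))
            d += width  # features accumulate layer by layer
        self.out_dim = d

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(torch.relu(layer(torch.cat(feats, dim=-1))))
        return torch.cat(feats, dim=-1)

trunk = DenseBlockMLP(d_in=17, width=256, depth=4)
print(trunk(torch.randn(32, 17)).shape)  # torch.Size([32, 1041]) = 17 + 4*256
```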
arXiv Detail & Related papers (2021-02-16T02:16:54Z)
- Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks so that they fit on mobile devices.
We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
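The "remove and add elements" idea is exemplified by dynamic sparse training updates such as SET, which periodically drops the weakest active weights and regrows the same number elsewhere (RigL instead regrows by gradient magnitude). A hedged sketch of one such update:
```python
import torch

def prune_and_regrow(weight: torch.Tensor, mask: torch.Tensor,
                     frac: float = 0.1) -> torch.Tensor:
    """One SET-style update: drop the smallest-magnitude active weights,
    regrow that many at random inactive positions; sparsity is unchanged."""
    active = mask.bool()
    n_drop = int(frac * active.sum().item())
    if n_drop == 0:
        return mask
    # Prune: smallest |w| among active connections.
    w = weight.abs().masked_fill(~active, float("inf"))
    drop_idx = torch.topk(w.flatten(), n_drop, largest=False).indices
    mask = mask.clone()
    mask.view(-1)[drop_idx] = 0.0
    # Regrow: random currently-inactive positions.
    inactive = (mask.view(-1) == 0).nonzero().flatten()
    grow_idx = inactive[torch.randperm(inactive.numel())[:n_drop]]
    mask.view(-1)[grow_idx] = 1.0
    return mask

w = torch.randn(256, 256)
m = (torch.rand_like(w) > 0.9).float()  # start ~10% dense
m = prune_and_regrow(w, m, frac=0.1)    # same sparsity, new connectivity
```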
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
- Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? [15.578423102700764]
We propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms.
We show that the RL agents learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations.
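The gist, under my reading of the abstract: learn extra features online and feed the agent the concatenation of the raw state and those features, so the agent's input is higher-dimensional than the observation. A minimal sketch (OFENet itself uses DenseNet-style blocks trained with an auxiliary next-state prediction loss, which is omitted here):
```python
import torch
import torch.nn as nn

class OnlineFeatureExtractor(nn.Module):
    """Map a raw observation to a *higher*-dimensional agent input by
    concatenating learned features onto the original state; architecture
    below is illustrative, not OFENet's exact design."""
    def __init__(self, obs_dim: int, feat_dim: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(obs_dim, feat_dim), nn.ReLU())

    def forward(self, obs):
        return torch.cat([obs, self.net(obs)], dim=-1)

ofe = OnlineFeatureExtractor(obs_dim=17, feat_dim=240)
print(ofe(torch.randn(32, 17)).shape)  # torch.Size([32, 257]), wider input
```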
arXiv Detail & Related papers (2020-03-03T16:52:05Z)
This list is automatically generated from the titles and abstracts of the papers on this site.