Related papers: Training Larger Networks for Deep Reinforcement Learning

URL: http://arxiv.org/abs/2102.07920v1
Date: Tue, 16 Feb 2021 02:16:54 GMT
Title: Training Larger Networks for Deep Reinforcement Learning
Authors: Kei Ota, Devesh K. Jha, Asako Kanezaki
Abstract summary: We show that naively increasing network capacity does not improve performance. We propose a novel method that consists of 1) wider networks with DenseNet connection, 2) decoupling representation learning from training of RL, and 3) a distributed training method to mitigate overfitting problems. Using this three-fold technique, we show that we can train very large networks that result in significant performance gains.
Score: 18.193180866998333
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: The success of deep learning in the computer vision and natural language processing communities can be attributed to the training of very deep neural networks with millions or billions of parameters, which can then be trained with massive amounts of data. However, a similar trend has largely eluded the training of deep reinforcement learning (RL) algorithms, where larger networks do not lead to performance improvement. Previous work has shown that this is mostly due to instability during training of deep RL agents when using larger networks. In this paper, we attempt to understand and address the training of larger networks for deep RL. We first show that naively increasing network capacity does not improve performance. Then, we propose a novel method that consists of 1) wider networks with DenseNet connection, 2) decoupling representation learning from training of RL, and 3) a distributed training method to mitigate overfitting problems. Using this three-fold technique, we show that we can train very large networks that result in significant performance gains. We present several ablation studies to demonstrate the efficacy of the proposed method and to give some intuitive understanding of the reasons for the performance gain. We show that our proposed method outperforms other baseline algorithms on several challenging locomotion tasks.
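
Illustrative sketch (not from the paper): the abstract mentions "wider networks with DenseNet connection" without giving architectural details. The code below shows the general idea of DenseNet-style connectivity in a fully connected network, where each layer receives the concatenation of the original input and the outputs of all earlier layers. The function names, the ReLU activation, the initialization scale, and the layer sizes are assumptions chosen for illustration; they are not taken from the authors' implementation.

```python
import random


def init_dense_mlp(in_dim, width, num_layers, seed=0):
    """Create weights for a hypothetical densely connected MLP.

    Layer k takes (in_dim + k * width) inputs, because it sees the raw
    input plus the outputs of all k previous layers.
    """
    rng = random.Random(seed)
    layers = []
    dim = in_dim
    for _ in range(num_layers):
        scale = (2.0 / dim) ** 0.5  # assumed He-style scale, for illustration
        weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
                   for _ in range(width)]
        biases = [0.0] * width
        layers.append((weights, biases))
        dim += width  # the next layer also sees this layer's output
    return layers


def dense_mlp_forward(x, layers):
    """Forward pass: concatenate each layer's output onto its own input."""
    features = list(x)
    for weights, biases in layers:
        hidden = [
            max(0.0, sum(w * f for w, f in zip(row, features)) + b)  # ReLU
            for row, b in zip(weights, biases)
        ]
        features = features + hidden  # dense (concatenative) connection
    return features


if __name__ == "__main__":
    layers = init_dense_mlp(in_dim=3, width=4, num_layers=2)
    out = dense_mlp_forward([1.0, -2.0, 0.5], layers)
    print(len(out))  # input dims plus two layers of width 4
```

With this connectivity the raw input stays directly available to every layer, so the feature vector grows by `width` at each layer instead of being replaced.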

Related papers

Online Training and Pruning of Deep Reinforcement Learning Networks [0.0]
Scaling deep neural networks (NN) of reinforcement learning (RL) algorithms has been shown to enhance performance when feature extraction networks are used. We propose an approach to integrate simultaneous training and pruning within advanced RL methods.
arXiv Detail & Related papers (2025-07-16T07:17:41Z)
Network Sparsity Unlocks the Scaling Potential of Deep Reinforcement Learning [57.3885832382455]
We show that introducing static network sparsity alone can unlock further scaling potential beyond dense counterparts with state-of-the-art architectures. Our analysis reveals that, in contrast to naively scaling up dense DRL networks, such sparse networks achieve higher parameter efficiency for network expressivity.
arXiv Detail & Related papers (2025-06-20T17:54:24Z)
Deep Fusion: Efficient Network Training via Pre-trained Initializations [3.9146761527401424]
We present Deep Fusion, an efficient approach to network training that leverages pre-trained initializations of smaller networks. Our experiments show how Deep Fusion is a practical and effective approach that not only accelerates the training process but also reduces computational requirements. We validate our theoretical framework, which guides the optimal use of Deep Fusion, showing that it significantly reduces both training time and resource consumption.
arXiv Detail & Related papers (2023-06-20T21:30:54Z)
Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems. We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
arXiv Detail & Related papers (2023-06-14T01:24:42Z)
Single-Shot Pruning for Offline Reinforcement Learning [47.886329599997474]
Deep Reinforcement Learning (RL) is a powerful framework for solving complex real-world problems. One way to tackle this problem is to prune neural networks leaving only the necessary parameters. We close the gap between RL and single-shot pruning techniques and present a general pruning approach to the Offline RL.
arXiv Detail & Related papers (2021-12-31T18:10:02Z)
Recursive Least-Squares Estimator-Aided Online Learning for Visual Tracking [58.14267480293575]
We propose a simple yet effective online learning approach for few-shot online adaptation without requiring offline training. It allows an in-built memory retention mechanism for the model to remember the knowledge about the object seen before. We evaluate our approach based on two networks in the online learning families for tracking, i.e., multi-layer perceptrons in RT-MDNet and convolutional neural networks in DiMP.
arXiv Detail & Related papers (2021-12-28T06:51:18Z)
On The Transferability of Deep-Q Networks [6.822707222147354]
Transfer Learning (TL) is an efficient machine learning paradigm that allows overcoming some of the hurdles that characterize the successful training of deep neural networks. While exploiting TL is a well-established and successful training practice in Supervised Learning (SL), its applicability in Deep Reinforcement Learning (DRL) is rarer. In this paper, we study the level of transferability of three different variants of Deep-Q Networks on popular DRL benchmarks and on a set of novel, carefully designed control tasks.
arXiv Detail & Related papers (2021-10-06T10:29:37Z)
Dynamic Sparse Training for Deep Reinforcement Learning [36.66889208433228]
We propose, for the first time, to dynamically train deep reinforcement learning agents with sparse neural networks from scratch. Our approach is easy to integrate into existing deep reinforcement learning algorithms. We evaluate our approach on OpenAI gym continuous control tasks.
arXiv Detail & Related papers (2021-06-08T09:57:20Z)
Sparsity in Deep Learning: Pruning and growth for efficient inference and training in neural networks [78.47459801017959]
Sparsity can reduce the memory footprint of regular networks to fit mobile devices. We describe approaches to remove and add elements of neural networks, different training strategies to achieve model sparsity, and mechanisms to exploit sparsity in practice.
arXiv Detail & Related papers (2021-01-31T22:48:50Z)
Go Wide, Then Narrow: Efficient Training of Deep Thin Networks [62.26044348366186]
We propose an efficient method to train a deep thin network with a theoretical guarantee. By training with our method, ResNet50 can outperform ResNet101, and BERT Base can be comparable with BERT Large.
arXiv Detail & Related papers (2020-07-01T23:34:35Z)
Can Increasing Input Dimensionality Improve Deep Reinforcement Learning? [15.578423102700764]
We propose an online feature extractor network (OFENet) that uses neural nets to produce good representations to be used as inputs to deep RL algorithms. We show that the RL agents learn more efficiently with the high-dimensional representation than with the lower-dimensional state observations.
arXiv Detail & Related papers (2020-03-03T16:52:05Z)
Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources. Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize. We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)

This list is automatically generated from the titles and abstracts of the papers on this site.