On The Transferability of Deep-Q Networks
- URL: http://arxiv.org/abs/2110.02639v1
- Date: Wed, 6 Oct 2021 10:29:37 GMT
- Title: On The Transferability of Deep-Q Networks
- Authors: Matthia Sabatelli, Pierre Geurts
- Abstract summary: Transfer Learning (TL) is an efficient machine learning paradigm that helps overcome some of the hurdles that characterize the successful training of deep neural networks.
While exploiting TL is a well-established and successful training practice in Supervised Learning (SL), it is far less commonly applied in Deep Reinforcement Learning (DRL).
In this paper, we study the level of transferability of three different variants of Deep-Q Networks on popular DRL benchmarks and on a set of novel, carefully designed control tasks.
- Score: 6.822707222147354
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Transfer Learning (TL) is an efficient machine learning paradigm that helps overcome some of the hurdles that characterize the successful training of deep neural networks, ranging from long training times to the need for large datasets. While exploiting TL is a well-established and successful training practice in Supervised Learning (SL), it is far less commonly applied in Deep Reinforcement Learning (DRL). In this paper, we study the level of transferability of three different variants of Deep-Q Networks on popular DRL benchmarks as well as on a set of novel, carefully designed control tasks. Our results show that transferring neural networks in a DRL context can be particularly challenging and is a process which in most cases results in negative transfer. In an attempt to understand why Deep-Q Networks transfer so poorly, we gain novel insights into the training dynamics that characterize this family of algorithms.
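As a minimal, hedged sketch of what such a transfer typically amounts to (not the authors' exact protocol), one copies the pretrained feature-extractor weights of a source-task Deep-Q Network into the target-task network and re-initializes the task-specific Q-value head; the architecture, layer sizes, and action spaces below are illustrative assumptions.

    # Hypothetical sketch: warm-start a target-task DQN from a source-task DQN.
    import torch.nn as nn

    class DQN(nn.Module):
        def __init__(self, n_inputs, n_actions):
            super().__init__()
            # shared feature extractor followed by a task-specific Q-value head
            self.features = nn.Sequential(nn.Linear(n_inputs, 128), nn.ReLU(),
                                          nn.Linear(128, 128), nn.ReLU())
            self.head = nn.Linear(128, n_actions)

        def forward(self, x):
            return self.head(self.features(x))

    source_net = DQN(n_inputs=4, n_actions=2)   # assumed pretrained on the source task
    target_net = DQN(n_inputs=4, n_actions=3)   # target task with a different action space

    # Transfer only the feature extractor; the Q-value head stays freshly initialized,
    # after which the target network is fine-tuned with regular Q-learning.
    target_net.features.load_state_dict(source_net.features.state_dict())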
Related papers
- Why do Learning Rates Transfer? Reconciling Optimization and Scaling Limits for Deep Learning [77.82908213345864]
We find empirical evidence that learning rate transfer can be attributed to the fact that under $\mu$P and its depth extension, the largest eigenvalue of the training loss Hessian is largely independent of the width and depth of the network.
We show that under the neural tangent kernel (NTK) regime, the sharpness exhibits very different dynamics at different scales, thus preventing learning rate transfer.
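As a rough illustration of the quantity this claim revolves around, the sharpness (largest eigenvalue of the training-loss Hessian) can be estimated by power iteration on Hessian-vector products; the toy model and data below are placeholders, not the paper's setup.

    # Hedged sketch: estimate the top Hessian eigenvalue of a training loss
    # via power iteration on Hessian-vector products.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Linear(64, 1))
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = nn.functional.mse_loss(model(x), y)

    params = [p for p in model.parameters() if p.requires_grad]
    grads = torch.autograd.grad(loss, params, create_graph=True)

    v = [torch.randn_like(p) for p in params]             # random starting direction
    for _ in range(20):                                    # power iteration
        hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
        norm = torch.sqrt(sum((h * h).sum() for h in hv))
        v = [h / norm for h in hv]

    hv = torch.autograd.grad(grads, params, grad_outputs=v, retain_graph=True)
    sharpness = sum((a * b).sum() for a, b in zip(hv, v))  # Rayleigh quotient v^T H v
    print(float(sharpness))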
arXiv Detail & Related papers (2024-02-27T12:28:01Z) - Deep Fusion: Efficient Network Training via Pre-trained Initializations [3.9146761527401424]
We present Deep Fusion, an efficient approach to network training that leverages pre-trained initializations of smaller networks.
Our experiments show how Deep Fusion is a practical and effective approach that not only accelerates the training process but also reduces computational requirements.
We validate our theoretical framework, which guides the optimal use of Deep Fusion, showing that it significantly reduces both training time and resource consumption.
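A minimal sketch of the general idea of reusing pretrained initializations from smaller networks (not the paper's actual Deep Fusion procedure): the weights of two small pretrained layers are placed block-diagonally inside the corresponding layer of a wider network, so the large model starts from pretrained parameters.

    # Hypothetical sketch: initialize a wide layer from two smaller pretrained layers.
    import torch
    import torch.nn as nn

    small_a = nn.Linear(16, 32)   # assumed to come from pretrained small network A
    small_b = nn.Linear(16, 32)   # assumed to come from pretrained small network B

    wide = nn.Linear(32, 64)      # the corresponding layer of the larger network
    with torch.no_grad():
        wide.weight.zero_()
        wide.weight[:32, :16] = small_a.weight   # top-left block from network A
        wide.weight[32:, 16:] = small_b.weight   # bottom-right block from network B
        wide.bias[:32] = small_a.bias
        wide.bias[32:] = small_b.bias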
arXiv Detail & Related papers (2023-06-20T21:30:54Z) - Solving Large-scale Spatial Problems with Convolutional Neural Networks [88.31876586547848]
We employ transfer learning to improve training efficiency for large-scale spatial problems.
We propose that a convolutional neural network (CNN) can be trained on small windows of signals, but evaluated on arbitrarily large signals with little to no performance degradation.
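A brief sketch of why this is possible, assuming a fully convolutional architecture: such a network has no fixed input length, so weights trained on short windows can be applied unchanged to a much longer signal. Layers and sizes below are illustrative, not the paper's model.

    # Hedged sketch: a fully convolutional 1-D network evaluated on inputs of any length.
    import torch
    import torch.nn as nn

    cnn = nn.Sequential(
        nn.Conv1d(1, 16, kernel_size=5, padding=2), nn.ReLU(),
        nn.Conv1d(16, 1, kernel_size=5, padding=2),
    )

    short_windows = torch.randn(8, 1, 64)       # training-sized windows
    long_signal = torch.randn(1, 1, 100_000)    # arbitrarily long evaluation signal

    print(cnn(short_windows).shape)   # torch.Size([8, 1, 64])
    print(cnn(long_signal).shape)     # torch.Size([1, 1, 100000])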
arXiv Detail & Related papers (2023-06-14T01:24:42Z) - Provable Guarantees for Nonlinear Feature Learning in Three-Layer Neural Networks [49.808194368781095]
We show that three-layer neural networks have provably richer feature learning capabilities than two-layer networks.
This work makes progress towards understanding the provable benefit of three-layer neural networks over two-layer networks in the feature learning regime.
arXiv Detail & Related papers (2023-05-11T17:19:30Z) - The State of Sparse Training in Deep Reinforcement Learning [23.034856834801346]
The use of sparse neural networks has seen rapid growth in recent years, particularly in computer vision.
Their appeal stems largely from the reduced number of parameters required to train and store such networks, as well as from gains in learning efficiency.
We perform a systematic investigation into applying a number of existing sparse training techniques on a variety of Deep Reinforcement Learning agents and environments.
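One common ingredient of such sparse-training techniques, sketched here with an illustrative layer and sparsity level, is a binary magnitude-based mask applied to an agent's weights and re-applied after every update.

    # Hedged sketch: magnitude pruning of one layer of a (hypothetical) DQN agent.
    import torch
    import torch.nn as nn

    layer = nn.Linear(256, 256)    # e.g. one hidden layer of the agent's network
    sparsity = 0.9                 # keep only 10% of the weights

    with torch.no_grad():
        k = int(layer.weight.numel() * (1 - sparsity))            # weights to keep
        kth_largest = layer.weight.abs().flatten().topk(k).values[-1]
        mask = (layer.weight.abs() >= kth_largest).float()        # 1 for the k largest magnitudes
        layer.weight.mul_(mask)    # prune; the mask is re-applied after each gradient step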
arXiv Detail & Related papers (2022-06-17T14:08:00Z) - Uncertainty Quantification and Resource-Demanding Computer Vision Applications of Deep Learning [5.130440339897478]
Bringing deep neural networks (DNNs) into safety critical applications requires a thorough treatment of the model's uncertainties.
In this article, we survey methods that we developed to teach DNNs to be uncertain when they encounter new object classes.
We also present training methods to learn from only a few labels with help of uncertainty quantification.
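As a generic illustration of uncertainty quantification (not necessarily one of the surveyed methods), Monte-Carlo dropout is a simple way to make a DNN report how unsure it is about a prediction; the model and sample count are illustrative.

    # Hedged sketch: predictive uncertainty via Monte-Carlo dropout.
    import torch
    import torch.nn as nn

    model = nn.Sequential(nn.Linear(10, 64), nn.ReLU(), nn.Dropout(p=0.5), nn.Linear(64, 3))
    x = torch.randn(1, 10)

    model.train()                                    # keep dropout active at inference time
    samples = torch.stack([model(x).softmax(-1) for _ in range(100)])
    mean_prob = samples.mean(0)                      # averaged class probabilities
    uncertainty = samples.var(0)                     # high variance -> the model is unsure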
arXiv Detail & Related papers (2022-05-30T08:31:03Z) - Deep Reinforcement Learning with Spiking Q-learning [51.386945803485084]
Spiking neural networks (SNNs) are expected to realize artificial intelligence (AI) with less energy consumption.
Combining SNNs with deep reinforcement learning (RL) offers a promising, energy-efficient approach to realistic control tasks.
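A minimal sketch of the kind of building block such spiking Q-networks rest on, a leaky integrate-and-fire (LIF) neuron; the constants and reset rule are illustrative assumptions.

    # Hedged sketch: one simulation step of a leaky integrate-and-fire neuron layer.
    import torch

    def lif_step(membrane, current, threshold=1.0, decay=0.9):
        # Leak, integrate the input current, emit spikes, then reset fired neurons.
        membrane = decay * membrane + current
        spikes = (membrane >= threshold).float()
        membrane = membrane * (1.0 - spikes)     # reset the neurons that fired
        return membrane, spikes

    membrane = torch.zeros(4)                    # four neurons
    for t in range(10):                          # unroll over simulation time steps
        membrane, spikes = lif_step(membrane, torch.rand(4))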
arXiv Detail & Related papers (2022-01-21T16:42:11Z) - Provable Regret Bounds for Deep Online Learning and Control [77.77295247296041]
We show that any loss function can be adapted to optimize the parameters of a neural network such that it competes with the best net in hindsight.
As an application of these results in the online setting, we obtain provable bounds for online controllers.
arXiv Detail & Related papers (2021-10-15T02:13:48Z) - Training Larger Networks for Deep Reinforcement Learning [18.193180866998333]
We show that naively increasing network capacity does not improve performance.
We propose a novel method that consists of 1) wider networks with DenseNet connections, 2) decoupling representation learning from RL training, and 3) a distributed training method to mitigate overfitting problems.
Using this three-fold technique, we show that we can train very large networks that result in significant performance gains.
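A minimal sketch of the second ingredient, decoupling representation learning from RL training, under the assumption that the encoder is trained by a separate objective (or frozen) while only the Q-head receives RL gradients; module sizes are illustrative.

    # Hedged sketch: only the Q-head is updated by the RL loss; the encoder is frozen.
    import torch
    import torch.nn as nn

    encoder = nn.Sequential(nn.Linear(64, 256), nn.ReLU())   # trained by a separate objective
    q_head = nn.Linear(256, 6)                               # trained with the Q-learning loss

    for p in encoder.parameters():
        p.requires_grad = False                              # RL gradients never touch the encoder

    optimizer = torch.optim.Adam(q_head.parameters(), lr=1e-4)
    q_values = q_head(encoder(torch.randn(32, 64)))          # only q_head receives gradients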
arXiv Detail & Related papers (2021-02-16T02:16:54Z) - Deep Transfer Learning with Ridge Regression [7.843067454030999]
Deep models trained with massive amounts of data demonstrate promising generalisation ability on unseen data from relevant domains.
We address this issue by leveraging the low-rank property of learnt feature vectors produced from deep neural networks (DNNs) with the closed-form solution provided in kernel ridge regression (KRR).
Our method is successful on supervised and semi-supervised transfer learning tasks.
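For concreteness, the KRR closed form applied to DNN features looks roughly as follows; the RBF kernel, its bandwidth, the regularization strength, and the feature dimensions are all illustrative assumptions.

    # Hedged sketch: closed-form kernel ridge regression on features extracted by a DNN.
    import torch

    feats_train = torch.randn(500, 128)   # stand-in for penultimate-layer DNN features
    y_train = torch.randn(500, 10)
    feats_test = torch.randn(50, 128)
    lam, gamma = 1e-2, 1.0 / 128          # ridge strength and RBF bandwidth

    def rbf(a, b):
        return torch.exp(-gamma * torch.cdist(a, b) ** 2)

    # Closed form: alpha = (K + lam * I)^{-1} y, prediction = K_test @ alpha
    alpha = torch.linalg.solve(rbf(feats_train, feats_train) + lam * torch.eye(500), y_train)
    preds = rbf(feats_test, feats_train) @ alpha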
arXiv Detail & Related papers (2020-06-11T20:21:35Z) - Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
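As a heavily simplified, generic sketch of local learning (not the paper's recursive local representation alignment algorithm), each layer can be trained against its own auxiliary objective, so no global backward pass through the full stack is required; all module choices below are illustrative.

    # Hedged sketch: greedy layer-local updates instead of end-to-end backpropagation.
    import torch
    import torch.nn as nn

    layers = nn.ModuleList([nn.Linear(32, 32), nn.Linear(32, 32)])
    heads = nn.ModuleList([nn.Linear(32, 10) for _ in layers])     # local prediction heads
    opts = [torch.optim.SGD(list(l.parameters()) + list(h.parameters()), lr=0.01)
            for l, h in zip(layers, heads)]

    x, y = torch.randn(8, 32), torch.randint(0, 10, (8,))
    h = x
    for layer, head, opt in zip(layers, heads, opts):
        h = torch.relu(layer(h))
        loss = nn.functional.cross_entropy(head(h), y)   # purely local objective
        opt.zero_grad()
        loss.backward()
        opt.step()
        h = h.detach()                                   # stop gradients crossing layer boundaries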
arXiv Detail & Related papers (2020-02-10T16:20:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.