Cooperative Deep $Q$-learning Framework for Environments Providing Image
Feedback
- URL: http://arxiv.org/abs/2110.15305v1
- Date: Thu, 28 Oct 2021 17:12:41 GMT
- Title: Cooperative Deep $Q$-learning Framework for Environments Providing Image
Feedback
- Authors: Krishnan Raghavan and Vignesh Narayanan and Jagannathan Sarangapani
- Abstract summary: We address two key challenges in deep reinforcement learning setting, sample inefficiency and slow learning, with a dual NN-driven learning approach.
In particular, we develop a temporal difference (TD) error-driven learning approach, where we introduce a set of linear transformations of the TD error to directly update the parameters of each layer in the deep NN.
We show that the proposed methods enables faster learning and convergence and requires reduced buffer size.
- Score: 5.607676459156789
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we address two key challenges in deep reinforcement learning
setting, sample inefficiency and slow learning, with a dual NN-driven learning
approach. In the proposed approach, we use two deep NNs with independent
initialization to robustly approximate the action-value function in the
presence of image inputs. In particular, we develop a temporal difference (TD)
error-driven learning approach, where we introduce a set of linear
transformations of the TD error to directly update the parameters of each layer
in the deep NN. We demonstrate theoretically that the cost minimized by the
error-driven learning (EDL) regime is an approximation of the empirical cost
and the approximation error reduces as learning progresses, irrespective of the
size of the network. Using simulation analysis, we show that the proposed
methods enables faster learning and convergence and requires reduced buffer
size (thereby increasing the sample efficiency).
Related papers
- Contrastive-Adversarial and Diffusion: Exploring pre-training and fine-tuning strategies for sulcal identification [3.0398616939692777]
Techniques like adversarial learning, contrastive learning, diffusion denoising learning, and ordinary reconstruction learning have become standard.
The study aims to elucidate the advantages of pre-training techniques and fine-tuning strategies to enhance the learning process of neural networks.
arXiv Detail & Related papers (2024-05-29T15:44:51Z) - QGait: Toward Accurate Quantization for Gait Recognition with Binarized Input [17.017127559393398]
We propose a differentiable soft quantizer, which better simulates the gradient of the round function during backpropagation.
This enables the network to learn from subtle input perturbations.
We further refine the training strategy to ensure convergence while simulating quantization errors.
arXiv Detail & Related papers (2024-05-22T17:34:18Z) - Convergence Visualizer of Decentralized Federated Distillation with
Reduced Communication Costs [3.2098126952615442]
Federated learning (FL) achieves collaborative learning without the need for data sharing, thus preventing privacy leakage.
This study solves two unresolved challenges of CMFD: (1) communication cost reduction and (2) visualization of model convergence.
arXiv Detail & Related papers (2023-12-19T07:23:49Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets [2.824895388993495]
We provide theoretical guarantees for reliable learning under the information-theoretic AEP.
We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning.
Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity.
arXiv Detail & Related papers (2022-09-28T17:33:11Z) - Solving Sparse Linear Inverse Problems in Communication Systems: A Deep
Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z) - Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem)
AdaRem adjusts the parameter-wise learning rate according to whether the direction of one parameter changes in the past is aligned with the direction of the current gradient.
Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d)
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.