Cooperative Deep $Q$-learning Framework for Environments Providing Image
Feedback
- URL: http://arxiv.org/abs/2110.15305v1
- Date: Thu, 28 Oct 2021 17:12:41 GMT
- Title: Cooperative Deep $Q$-learning Framework for Environments Providing Image
Feedback
- Authors: Krishnan Raghavan and Vignesh Narayanan and Jagannathan Sarangapani
- Abstract summary: We address two key challenges in deep reinforcement learning setting, sample inefficiency and slow learning, with a dual NN-driven learning approach.
In particular, we develop a temporal difference (TD) error-driven learning approach, where we introduce a set of linear transformations of the TD error to directly update the parameters of each layer in the deep NN.
We show that the proposed methods enables faster learning and convergence and requires reduced buffer size.
- Score: 5.607676459156789
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this paper, we address two key challenges in deep reinforcement learning
setting, sample inefficiency and slow learning, with a dual NN-driven learning
approach. In the proposed approach, we use two deep NNs with independent
initialization to robustly approximate the action-value function in the
presence of image inputs. In particular, we develop a temporal difference (TD)
error-driven learning approach, where we introduce a set of linear
transformations of the TD error to directly update the parameters of each layer
in the deep NN. We demonstrate theoretically that the cost minimized by the
error-driven learning (EDL) regime is an approximation of the empirical cost
and the approximation error reduces as learning progresses, irrespective of the
size of the network. Using simulation analysis, we show that the proposed
methods enables faster learning and convergence and requires reduced buffer
size (thereby increasing the sample efficiency).
Related papers
- Revisiting Disparity from Dual-Pixel Images: Physics-Informed Lightweight Depth Estimation [3.6337378417255177]
We propose a lightweight disparity estimation method based on a completion-based network.
By modeling the DP-specific disparity error parametrically and using it for sampling during training, the network acquires the unique properties of DP.
As a result, the proposed method achieved state-of-the-art results while reducing the overall system size to 1/5 of that of the conventional method.
arXiv Detail & Related papers (2024-11-06T09:03:53Z) - Adaptive Anomaly Detection in Network Flows with Low-Rank Tensor Decompositions and Deep Unrolling [9.20186865054847]
Anomaly detection (AD) is increasingly recognized as a key component for ensuring the resilience of future communication systems.
This work considers AD in network flows using incomplete measurements.
We propose a novel block-successive convex approximation algorithm based on a regularized model-fitting objective.
Inspired by Bayesian approaches, we extend the model architecture to perform online adaptation to per-flow and per-time-step statistics.
arXiv Detail & Related papers (2024-09-17T19:59:57Z) - Convergence Visualizer of Decentralized Federated Distillation with
Reduced Communication Costs [3.2098126952615442]
Federated learning (FL) achieves collaborative learning without the need for data sharing, thus preventing privacy leakage.
This study solves two unresolved challenges of CMFD: (1) communication cost reduction and (2) visualization of model convergence.
arXiv Detail & Related papers (2023-12-19T07:23:49Z) - Faster Adaptive Federated Learning [84.38913517122619]
Federated learning has attracted increasing attention with the emergence of distributed data.
In this paper, we propose an efficient adaptive algorithm (i.e., FAFED) based on momentum-based variance reduced technique in cross-silo FL.
arXiv Detail & Related papers (2022-12-02T05:07:50Z) - Less is More: Rethinking Few-Shot Learning and Recurrent Neural Nets [2.824895388993495]
We provide theoretical guarantees for reliable learning under the information-theoretic AEP.
We then focus on a highly efficient recurrent neural net (RNN) framework and propose a reduced-entropy algorithm for few-shot learning.
Our experimental results demonstrate significant potential for improving learning models' sample efficiency, generalization, and time complexity.
arXiv Detail & Related papers (2022-09-28T17:33:11Z) - Solving Sparse Linear Inverse Problems in Communication Systems: A Deep
Learning Approach With Adaptive Depth [51.40441097625201]
We propose an end-to-end trainable deep learning architecture for sparse signal recovery problems.
The proposed method learns how many layers to execute to emit an output, and the network depth is dynamically adjusted for each task in the inference phase.
arXiv Detail & Related papers (2020-10-29T06:32:53Z) - Adaptive Gradient Method with Resilience and Momentum [120.83046824742455]
We propose an Adaptive Gradient Method with Resilience and Momentum (AdaRem)
AdaRem adjusts the parameter-wise learning rate according to whether the direction of one parameter changes in the past is aligned with the direction of the current gradient.
Our method outperforms previous adaptive learning rate-based algorithms in terms of the training speed and the test error.
arXiv Detail & Related papers (2020-10-21T14:49:00Z) - An Ode to an ODE [78.97367880223254]
We present a new paradigm for Neural ODE algorithms, called ODEtoODE, where time-dependent parameters of the main flow evolve according to a matrix flow on the group O(d)
This nested system of two flows provides stability and effectiveness of training and provably solves the gradient vanishing-explosion problem.
arXiv Detail & Related papers (2020-06-19T22:05:19Z) - Communication-Efficient Distributed Stochastic AUC Maximization with
Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network.
Our model requires a much less number of communication rounds and still a number of communication rounds in theory.
Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z) - Belief Propagation Reloaded: Learning BP-Layers for Labeling Problems [83.98774574197613]
We take one of the simplest inference methods, a truncated max-product Belief propagation, and add what is necessary to make it a proper component of a deep learning model.
This BP-Layer can be used as the final or an intermediate block in convolutional neural networks (CNNs)
The model is applicable to a range of dense prediction problems, is well-trainable and provides parameter-efficient and robust solutions in stereo, optical flow and semantic segmentation.
arXiv Detail & Related papers (2020-03-13T13:11:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.