Consolidated Adaptive T-soft Update for Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2202.12504v1
- Date: Fri, 25 Feb 2022 05:40:07 GMT
- Title: Consolidated Adaptive T-soft Update for Deep Reinforcement Learning
- Authors: Taisuke Kobayashi
- Abstract summary: T-soft update has been proposed as a noise-robust update rule for the target network.
This study develops adaptive T-soft (AT-soft) update by utilizing the update rule in AdaTerm.
- Score: 8.071506311915396
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Demand for deep reinforcement learning (DRL) is gradually increasing to enable
robots to perform complex tasks, yet DRL is known to be unstable. As a
technique to stabilize its learning, a target network that slowly and
asymptotically matches a main network is widely employed to generate stable
pseudo-supervised signals. Recently, T-soft update has been proposed as a
noise-robust update rule for the target network and has contributed to
improving DRL performance. However, the noise robustness of T-soft update
is specified by a hyperparameter, which must be tuned for each task, and is
degraded by its simplified implementation. This study develops adaptive
T-soft (AT-soft) update by utilizing the update rule in AdaTerm, which has been
developed recently. In addition, the concern that the target network may not
asymptotically match the main network is mitigated by a new consolidation that
pulls the main network back toward the target network. The resulting
consolidated AT-soft (CAT-soft) update is verified through numerical
simulations.
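For intuition, below is a minimal NumPy sketch contrasting the standard soft (Polyak) target-network update with a robustness-weighted variant in the spirit of T-soft update. The class name, the `nu` and `beta` parameters, and the exact weighting formula are illustrative assumptions only; they do not reproduce the paper's T-soft, AT-soft, or CAT-soft rules.

```python
import numpy as np

def soft_update(target, main, tau=0.01):
    """Standard soft (Polyak) update: the target slowly tracks the main network."""
    return (1.0 - tau) * target + tau * main

class RobustSoftUpdater:
    """Hedged sketch of a T-soft-style target update (not the paper's exact rule).

    The effective step size is shrunk when the deviation between the main and
    target parameters is large relative to a running scale, following the
    heavy-tailed (student-t) intuition behind T-soft update. `nu`, `beta`, and
    the weighting formula below are illustrative assumptions.
    """

    def __init__(self, tau=0.01, nu=1.0, beta=0.99):
        self.tau, self.nu, self.beta = tau, nu, beta
        self.scale = 1e-8  # running estimate of the squared deviation

    def update(self, target, main):
        delta = main - target
        dev = float(np.mean(delta ** 2))
        # Student-t-like weight: close to 1 for typical deviations, < 1 for outliers.
        w = (self.nu + 1.0) / (self.nu + dev / (self.scale + 1e-8))
        w = min(w, 1.0)
        # Update the running scale estimate of the deviation.
        self.scale = self.beta * self.scale + (1.0 - self.beta) * dev
        return target + self.tau * w * delta

# Toy usage: a drifting, occasionally noisy "main network" tracked by a target.
rng = np.random.default_rng(0)
theta_main, theta_target = np.zeros(4), np.zeros(4)
updater = RobustSoftUpdater(tau=0.1)
for step in range(200):
    noise = 5.0 if step % 50 == 49 else 0.05   # rare large outlier steps
    theta_main = theta_main + 0.01 + noise * rng.standard_normal(4)
    theta_target = updater.update(theta_target, theta_main)
print(theta_target)
```

The design idea illustrated here is that unusually large deviations between the main and target parameters (e.g. caused by noisy gradient updates) receive a smaller effective step size, so a single outlier update to the main network does not drag the target network away from its slowly evolving estimate.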
Related papers
- Stabilizing RNN Gradients through Pre-training [3.335932527835653]
Learning theory proposes preventing the gradient from growing exponentially with depth or time in order to stabilize and improve training.
We extend known stability theories to encompass a broader family of deep recurrent networks, requiring minimal assumptions on data and parameter distribution.
We propose a new approach to mitigate this issue, which consists of giving a weight of one half to the time and depth contributions to the gradient.
arXiv Detail & Related papers (2023-08-23T11:48:35Z)
- Multiplicative update rules for accelerating deep learning training and increasing robustness [69.90473612073767]
We propose an optimization framework that fits to a wide range of machine learning algorithms and enables one to apply alternative update rules.
We claim that the proposed framework accelerates training, while leading to more robust models compared to the traditionally used additive update rule.
arXiv Detail & Related papers (2023-07-14T06:44:43Z)
- Quantization-aware Interval Bound Propagation for Training Certifiably Robust Quantized Neural Networks [58.195261590442406]
We study the problem of training and certifying adversarially robust quantized neural networks (QNNs).
Recent work has shown that floating-point neural networks that have been verified to be robust can become vulnerable to adversarial attacks after quantization.
We present quantization-aware interval bound propagation (QA-IBP), a novel method for training robust QNNs.
arXiv Detail & Related papers (2022-11-29T13:32:38Z)
- Learning in Feedback-driven Recurrent Spiking Neural Networks using full-FORCE Training [4.124948554183487]
We propose a supervised training procedure for RSNNs, where a second network is introduced only during the training.
The proposed training procedure consists of generating targets for both recurrent and readout layers.
We demonstrate the improved performance and noise robustness of the proposed full-FORCE training procedure in modeling 8 dynamical systems.
arXiv Detail & Related papers (2022-05-26T19:01:19Z)
- Learning Fast and Slow for Online Time Series Forecasting [76.50127663309604]
Fast and Slow learning Networks (FSNet) is a holistic framework for online time-series forecasting.
FSNet balances fast adaptation to recent changes and retrieving similar old knowledge.
Our code will be made publicly available.
arXiv Detail & Related papers (2022-02-23T18:23:07Z)
- Ensemble-in-One: Learning Ensemble within Random Gated Networks for Enhanced Adversarial Robustness [18.514706498043214]
Adversarial attacks pose high security risks to modern deep learning systems.
We propose ensemble-in-one (EIO) to train an ensemble within one random gated network (RGN).
EIO consistently outperforms previous ensemble training methods with even less computational overhead.
arXiv Detail & Related papers (2021-03-27T03:13:03Z)
- t-Soft Update of Target Network for Deep Reinforcement Learning [8.071506311915396]
This paper proposes a new robust update rule of the target network for deep reinforcement learning (DRL).
A t-soft update method is derived with reference to the analogy between the exponential moving average and the normal distribution.
In PyBullet robotics simulations for DRL, an online actor-critic algorithm with the t-soft update outperformed the conventional methods in terms of the obtained return and/or its variance.
arXiv Detail & Related papers (2020-08-25T07:41:47Z)
- Improve Generalization and Robustness of Neural Networks via Weight Scale Shifting Invariant Regularizations [52.493315075385325]
We show that a family of regularizers, including weight decay, is ineffective at penalizing the intrinsic norms of weights for networks with homogeneous activation functions.
We propose an improved regularizer that is invariant to weight scale shifting and thus effectively constrains the intrinsic norm of a neural network.
arXiv Detail & Related papers (2020-08-07T02:55:28Z)
- Progressive Tandem Learning for Pattern Recognition with Deep Spiking Neural Networks [80.15411508088522]
Spiking neural networks (SNNs) have shown advantages over traditional artificial neural networks (ANNs) for low latency and high computational efficiency.
We propose a novel ANN-to-SNN conversion and layer-wise learning framework for rapid and efficient pattern recognition.
arXiv Detail & Related papers (2020-07-02T15:38:44Z)
- Rapid Structural Pruning of Neural Networks with Set-based Task-Adaptive Meta-Pruning [83.59005356327103]
A common limitation of most existing pruning techniques is that they require pre-training of the network at least once before pruning.
We propose STAMP, which task-adaptively prunes a network pretrained on a large reference dataset by generating a pruning mask on it as a function of the target dataset.
We validate STAMP against recent advanced pruning methods on benchmark datasets.
arXiv Detail & Related papers (2020-06-22T10:57:43Z)
- STDPG: A Spatio-Temporal Deterministic Policy Gradient Agent for Dynamic Routing in SDN [6.27420060051673]
Dynamic routing in software-defined networking (SDN) can be viewed as a centralized decision-making problem.
We propose a novel model-free framework for dynamic routing in SDN, referred to as the spatio-temporal deterministic policy gradient (STDPG) agent.
STDPG achieves better routing solutions in terms of average end-to-end delay.
arXiv Detail & Related papers (2020-04-21T07:19:07Z)