A Nesterov's Accelerated quasi-Newton method for Global Routing using
Deep Reinforcement Learning
- URL: http://arxiv.org/abs/2010.09465v1
- Date: Thu, 15 Oct 2020 07:30:17 GMT
- Title: A Nesterov's Accelerated quasi-Newton method for Global Routing using
Deep Reinforcement Learning
- Authors: S. Indrapriyadarsini, Shahrzad Mahboubi, Hiroshi Ninomiya, Takeshi
Kamio, Hideki Asai
- Abstract summary: This paper attempts to accelerate the training of deep Q-networks by introducing a second order Nesterov's accelerated quasi-Newton method.
We evaluate the performance of the proposed method on deep reinforcement learning using double DQNs for global routing.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The deep Q-learning method is one of the most widely used deep reinforcement
learning algorithms; it uses deep neural networks to approximate the action-value
function. Training of the deep Q-network (DQN) is usually restricted to first-order
gradient-based methods. This paper attempts to accelerate the training of deep
Q-networks by introducing a second-order Nesterov's accelerated quasi-Newton method.
We evaluate the performance of the proposed method on deep reinforcement learning
using double DQNs for global routing. The results show that the proposed method
obtains better routing solutions than DQNs trained with the first-order Adam and
RMSprop methods.
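To make the optimizer concrete, the sketch below shows a Nesterov's accelerated quasi-Newton (NAQ) update on a toy quadratic objective. The DQN training loop and global-routing environment from the paper are not reproduced, and the test problem, momentum, and step size are illustrative assumptions.

```python
import numpy as np

# Toy quadratic stand-in for the DQN loss: f(w) = 0.5 * w^T A w - b^T w.
A = np.diag([1.0, 2.0, 3.0])
b = np.ones(3)
f = lambda w: 0.5 * w @ A @ w - b @ w
grad = lambda w: A @ w - b

def naq_step(w, v, H, mu=0.9, alpha=0.4):
    """One Nesterov's accelerated quasi-Newton (NAQ) update (illustrative hyperparameters)."""
    w_ahead = w + mu * v                      # Nesterov look-ahead point
    g_ahead = grad(w_ahead)
    v_new = mu * v - alpha * H @ g_ahead      # quasi-Newton step along -H * gradient
    w_new = w + v_new
    s, y = w_new - w_ahead, grad(w_new) - g_ahead
    if s @ y > 1e-10:                         # BFGS update of the inverse-Hessian approximation
        rho = 1.0 / (s @ y)
        V = np.eye(len(w)) - rho * np.outer(s, y)
        H = V @ H @ V.T + rho * np.outer(s, s)
    return w_new, v_new, H

w, v, H = np.zeros(3), np.zeros(3), np.eye(3)
print("initial loss:", f(w))
for _ in range(50):
    w, v, H = naq_step(w, v, H)
print("final loss:  ", f(w))
```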
Related papers
- Application of linear regression and quasi-Newton methods to the deep reinforcement learning in continuous action cases [0.0]
The Least Squares Deep Q Network (LS-DQN) method was proposed by Levine et al., but, like DQN, it is restricted to discrete action spaces.
We propose the Double Least Squares Deep Deterministic Policy Gradient (DLS-DDPG) method to address this limitation.
Numerical experiments conducted in MuJoCo environments showed that the proposed method improved performance at least in some tasks.
arXiv Detail & Related papers (2025-03-19T08:10:54Z) - Neural-Network-Driven Reward Prediction as a Heuristic: Advancing Q-Learning for Mobile Robot Path Planning [10.066546417538786]
We propose the NDR-QL method, which uses neural network outputs as heuristic information to accelerate the convergence of Q-learning.
The proposed NDR-QL method improves the convergence speed of the baseline Q-learning method by 90% and also surpasses previously improved Q-learning methods in path-quality metrics.
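As an illustration of using predicted rewards as a heuristic, the following sketch biases a tabular Q-learning agent on a toy chain environment with a hand-made stand-in predictor. The `heuristic` function is a hypothetical placeholder for the paper's reward-prediction network, and the environment and reported speedups are not reproduced.

```python
import numpy as np

# Toy chain MDP: states 0..N-1, actions {0: left, 1: right}, reward 1 at the right end.
N, GAMMA, ALPHA, EPS = 10, 0.95, 0.5, 0.1
rng = np.random.default_rng(0)

def step(s, a):
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == N - 1), s2 == N - 1

def heuristic(s, a):
    # Hypothetical stand-in for a trained reward-prediction network: it merely
    # guesses that moving right is promising.  In NDR-QL this comes from a neural model.
    return 0.5 if a == 1 else 0.0

# Initialize the Q-table from the predicted values instead of zeros to bias exploration.
Q = np.array([[heuristic(s, a) for a in (0, 1)] for s in range(N)])

for episode in range(200):
    s = 0
    for _ in range(50):
        a = rng.integers(2) if rng.random() < EPS else int(np.argmax(Q[s]))
        s2, r, done = step(s, a)
        Q[s, a] += ALPHA * (r + GAMMA * np.max(Q[s2]) * (not done) - Q[s, a])
        s = s2
        if done:
            break
print(np.round(Q, 2))
```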
arXiv Detail & Related papers (2024-12-17T08:19:40Z) - A lifted Bregman strategy for training unfolded proximal neural network Gaussian denoisers [8.343594411714934]
Unfolded proximal neural networks (PNNs) form a family of methods that combines deep learning and proximal optimization approaches.
We propose a lifted training formulation based on Bregman distances for unfolded PNNs.
We assess the behaviour of the proposed training approach for PNNs through numerical simulations on image denoising.
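For context on what an unfolded proximal network computes, here is a minimal forward pass of an unrolled ISTA denoiser in which each layer applies a gradient step followed by a proximal (soft-thresholding) step. The lifted Bregman training formulation proposed in the paper is not shown, and the step size and thresholds are illustrative assumptions.

```python
import numpy as np

def soft_threshold(x, tau):
    # Proximal operator of the l1 norm; the "prox layer" of an unfolded PNN.
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def unfolded_ista_denoiser(y, n_layers=10, step=0.5, taus=None):
    """Forward pass of an unfolded proximal network: each layer is one ISTA
    iteration for min_x 0.5*||x - y||^2 + tau*||x||_1.  In a trained PNN the
    per-layer thresholds (and possibly step sizes) would be learned parameters."""
    taus = taus if taus is not None else [0.1] * n_layers
    x = np.zeros_like(y)
    for tau in taus:
        x = soft_threshold(x - step * (x - y), step * tau)
    return x

y = np.array([0.05, -0.8, 1.3, 0.02, -0.03])   # noisy sparse signal
print(unfolded_ista_denoiser(y))
```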
arXiv Detail & Related papers (2024-08-16T13:41:34Z) - An Efficient Learning-based Solver Comparable to Metaheuristics for the
Capacitated Arc Routing Problem [67.92544792239086]
We introduce an NN-based solver to significantly narrow the gap with advanced metaheuristics.
First, we propose a direction-aware facilitating attention model (DaAM) to incorporate directionality into the embedding process.
Second, we design a supervised reinforcement learning scheme that involves supervised pre-training to establish a robust initial policy.
arXiv Detail & Related papers (2024-03-11T02:17:42Z) - The Cascaded Forward Algorithm for Neural Network Training [61.06444586991505]
We propose a new learning framework for neural networks, namely the Cascaded Forward (CaFo) algorithm, which, like the Forward-Forward (FF) algorithm, does not rely on BP optimization.
Unlike FF, our framework directly outputs label distributions at each cascaded block and does not require the generation of additional negative samples.
In our framework each block can be trained independently, so it can be easily deployed into parallel acceleration systems.
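A minimal sketch of the block-local training idea described above: each cascaded block has its own predictor head that outputs a label distribution and is optimized with a local loss, with no gradients flowing between blocks. The architecture, data, and hyperparameters below are toy assumptions, not the CaFo setup from the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X = torch.randn(256, 20)
y = (X[:, 0] + 0.5 * X[:, 1] > 0).long()         # toy binary labels

# Two cascaded blocks; each has its own predictor head and its own optimizer,
# so there is no end-to-end backpropagation between blocks.
blocks = nn.ModuleList([nn.Sequential(nn.Linear(20, 16), nn.ReLU()),
                        nn.Sequential(nn.Linear(16, 16), nn.ReLU())])
heads = nn.ModuleList([nn.Linear(16, 2), nn.Linear(16, 2)])
opts = [torch.optim.Adam(list(b.parameters()) + list(h.parameters()), lr=1e-2)
        for b, h in zip(blocks, heads)]
loss_fn = nn.CrossEntropyLoss()

for epoch in range(100):
    feats = X
    for block, head, opt in zip(blocks, heads, opts):
        inp = feats.detach()                      # block i receives no gradient from block i+1
        out = block(inp)
        loss = loss_fn(head(out), y)              # local loss on this block's label distribution
        opt.zero_grad()
        loss.backward()
        opt.step()
        feats = out                               # features are passed forward to the next block
print("last-block training loss:", float(loss))
```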
arXiv Detail & Related papers (2023-03-17T02:01:11Z) - M$^2$DQN: A Robust Method for Accelerating Deep Q-learning Network [6.689964384669018]
We propose a framework which uses the Max-Mean loss in Deep Q-Network (M$^2$DQN).
Instead of sampling one batch of experiences in the training step, we sample several batches from the experience replay and update the parameters such that the maximum TD-error of these batches is minimized.
We verify the effectiveness of this framework with one of the most widely used techniques, Double DQN (DDQN) in several gym games.
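The max-over-batches idea can be sketched as follows: draw several batches from the replay buffer, compute the TD loss of each, and update on the largest one. The linear Q-function, synthetic replay buffer, and hyperparameters are toy assumptions; the full M$^2$DQN algorithm may differ in detail.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
N_STATES, N_ACTIONS, K_BATCHES, BATCH, GAMMA = 4, 2, 3, 32, 0.99
q_net = torch.nn.Linear(N_STATES, N_ACTIONS)       # stand-in Q-network
q_target = torch.nn.Linear(N_STATES, N_ACTIONS)    # target network (would be synced periodically)
q_target.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)

# Synthetic replay buffer of (state, action, reward, next_state) tuples.
S, A = torch.randn(1000, N_STATES), torch.randint(N_ACTIONS, (1000,))
R, S2 = torch.randn(1000), torch.randn(1000, N_STATES)

def batch_td_loss(idx):
    q = q_net(S[idx]).gather(1, A[idx].unsqueeze(1)).squeeze(1)
    with torch.no_grad():
        target = R[idx] + GAMMA * q_target(S2[idx]).max(dim=1).values
    return F.mse_loss(q, target)

for step in range(100):
    # Draw several batches and minimise the largest of their mean TD losses.
    losses = [batch_td_loss(torch.randint(1000, (BATCH,))) for _ in range(K_BATCHES)]
    opt.zero_grad()
    torch.stack(losses).max().backward()
    opt.step()
```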
arXiv Detail & Related papers (2022-09-16T09:20:35Z) - Provable Acceleration of Nesterov's Accelerated Gradient Method over Heavy Ball Method in Training Over-Parameterized Neural Networks [12.475834086073734]
First-order gradient methods have been extensively employed in training neural networks.
Recent research has proved that first-order methods are capable of attaining global minimum convergence when training over-parameterized neural networks.
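The two update rules being compared differ only in where the gradient is evaluated, as in the toy comparison below (an illustrative step size and momentum on a one-dimensional quadratic, not the over-parameterized setting analyzed in the paper).

```python
# One-dimensional quadratic f(w) = 0.5 * lam * w^2, used only to show the two update rules.
lam, alpha, beta = 4.0, 0.2, 0.9
grad = lambda w: lam * w

def heavy_ball(w, v):
    v_new = beta * v - alpha * grad(w)              # gradient taken at the current point
    return w + v_new, v_new

def nesterov(w, v):
    v_new = beta * v - alpha * grad(w + beta * v)   # gradient taken at the look-ahead point
    return w + v_new, v_new

for name, update in [("heavy ball", heavy_ball), ("nesterov", nesterov)]:
    w, v = 1.0, 0.0
    for _ in range(100):
        w, v = update(w, v)
    print(f"{name:10s} |w| after 100 steps: {abs(w):.2e}")
```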
arXiv Detail & Related papers (2022-08-08T07:13:26Z) - Backward Gradient Normalization in Deep Neural Networks [68.8204255655161]
We introduce a new technique for gradient normalization during neural network training.
The gradients are rescaled during the backward pass using normalization layers introduced at certain points within the network architecture.
Results on tests with very deep neural networks show that the new technique effectively controls the gradient norm.
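One way to emulate this behaviour is a backward hook that rescales the gradient passing a chosen layer to unit norm, as sketched below. This is a PyTorch-based approximation of the idea; the placement and exact normalization used in the paper may differ.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

def grad_norm_hook(module, grad_input, grad_output):
    # Rescale the gradient that continues backward past this layer to unit L2 norm.
    g = grad_input[0]
    if g is None:
        return None
    return (g / (g.norm() + 1e-12),) + grad_input[1:]

net = nn.Sequential(nn.Linear(10, 64), nn.ReLU(),
                    nn.Linear(64, 64), nn.ReLU(),
                    nn.Linear(64, 1))
net[2].register_full_backward_hook(grad_norm_hook)   # normalization point inside the network

x = torch.randn(32, 10)
net(x).pow(2).mean().backward()
print("gradient norm at the first layer:", float(net[0].weight.grad.norm()))
```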
arXiv Detail & Related papers (2021-06-17T13:24:43Z) - Local Critic Training for Model-Parallel Learning of Deep Neural
Networks [94.69202357137452]
We propose a novel model-parallel learning method, called local critic training.
We show that the proposed approach successfully decouples the update process of the layer groups for both convolutional neural networks (CNNs) and recurrent neural networks (RNNs).
We also show that networks trained by the proposed method can be used for structural optimization.
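A rough sketch of the decoupling mechanism: a small local critic predicts the downstream loss from the first layer group's output, the first group descends the critic's estimate, and the second group and the critic are trained separately. The architecture, losses, and training schedule here are toy assumptions rather than the configuration used in the paper.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
X, y = torch.randn(256, 20), torch.randn(256, 1)

group1 = nn.Sequential(nn.Linear(20, 32), nn.ReLU())       # first layer group
group2 = nn.Sequential(nn.Linear(32, 1))                    # last layer group
critic = nn.Sequential(nn.Linear(32, 16), nn.ReLU(), nn.Linear(16, 1))  # local critic for group1
opt1 = torch.optim.Adam(group1.parameters(), lr=1e-3)
opt2 = torch.optim.Adam(group2.parameters(), lr=1e-3)
optc = torch.optim.Adam(critic.parameters(), lr=1e-3)
mse = nn.MSELoss()

for it in range(200):
    h = group1(X)

    # group1 descends the critic's estimate of the downstream loss (no gradient from group2).
    opt1.zero_grad()
    critic(h).mean().backward()
    opt1.step()

    # group2 is trained on the true loss, treating group1's output as a fixed input.
    out = group2(h.detach())
    true_loss = mse(out, y)
    opt2.zero_grad()
    true_loss.backward()
    opt2.step()

    # The critic learns to predict the per-sample loss actually produced downstream.
    target = ((group2(h.detach()) - y) ** 2).detach()
    critic_loss = mse(critic(h.detach()), target)
    optc.zero_grad()
    critic_loss.backward()
    optc.step()

print("final true loss:", float(true_loss))
```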
arXiv Detail & Related papers (2021-02-03T09:30:45Z) - Cross Learning in Deep Q-Networks [82.20059754270302]
We propose a novel cross Q-learning algorithm, aimed at alleviating the well-known overestimation problem in value-based reinforcement learning methods.
Our algorithm builds on double Q-learning by maintaining a set of parallel models and estimating the Q-value based on a randomly selected network.
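A tabular illustration of the parallel-model idea: several Q-tables are maintained, and each update bootstraps from a randomly selected table in double-Q fashion. The chain environment and the exact selection/evaluation rule are simplifying assumptions, not the authors' deep cross Q-learning algorithm.

```python
import numpy as np

# Toy chain MDP used as a stand-in environment; K parallel Q-tables.
N, K, GAMMA, ALPHA, EPS = 8, 4, 0.95, 0.3, 0.1
rng = np.random.default_rng(0)
Qs = np.zeros((K, N, 2))

def step(s, a):
    s2 = min(s + 1, N - 1) if a == 1 else max(s - 1, 0)
    return s2, float(s2 == N - 1), s2 == N - 1

for episode in range(300):
    s = 0
    for _ in range(40):
        a = rng.integers(2) if rng.random() < EPS else int(np.argmax(Qs.mean(axis=0)[s]))
        s2, r, done = step(s, a)
        i, j = rng.choice(K, size=2, replace=False)   # table to update / table to bootstrap from
        a_star = int(np.argmax(Qs[i, s2]))            # select the action with one table ...
        target = r + GAMMA * Qs[j, s2, a_star] * (not done)  # ... evaluate it with another
        Qs[i, s, a] += ALPHA * (target - Qs[i, s, a])
        s = s2
        if done:
            break
print(np.round(Qs.mean(axis=0), 2))
```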
arXiv Detail & Related papers (2020-09-29T04:58:17Z) - Deep Networks with Fast Retraining [0.0]
This paper proposes a novel MP inverse-based fast retraining strategy for deep convolutional neural network (DCNN) learning.
In each training, a random learning strategy that controls the number of convolutional layers trained in the backward pass is first utilized.
Then, an MP inverse-based batch-by-batch learning strategy, which enables the network to be implemented without access to industrial-scale computational resources, is developed.
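The batch-by-batch Moore-Penrose (MP) inverse idea can be illustrated by solving for a network's output layer from accumulated normal equations, so no single step needs the whole dataset in memory. The frozen feature extractor and the ridge term below are illustrative assumptions, and the paper's random layer-selection strategy is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen random feature extractor, a stand-in for the convolutional part of a DCNN.
W_feat = rng.standard_normal((20, 64))
features = lambda X: np.tanh(X @ W_feat)

X_all = rng.standard_normal((5000, 20))
Y_all = np.sin(X_all[:, :3]).sum(axis=1, keepdims=True)   # synthetic regression targets

# Accumulate the normal equations batch by batch, so the pseudo-inverse solution
# for the output layer never needs the whole dataset in memory at once.
HtH, HtY = np.zeros((64, 64)), np.zeros((64, 1))
for start in range(0, 5000, 500):
    H = features(X_all[start:start + 500])
    HtH += H.T @ H
    HtY += H.T @ Y_all[start:start + 500]

W_out = np.linalg.solve(HtH + 1e-3 * np.eye(64), HtY)     # regularised MP-inverse solution
pred = features(X_all) @ W_out
print("training MSE of the retrained output layer:", float(np.mean((pred - Y_all) ** 2)))
```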
arXiv Detail & Related papers (2020-08-13T15:17:38Z) - Variance Reduction for Deep Q-Learning using Stochastic Recursive
Gradient [51.880464915253924]
Deep Q-learning algorithms often suffer from poor gradient estimations with an excessive variance.
This paper introduces a stochastic recursive gradient framework for updating the gradient estimates in deep Q-learning, yielding a novel algorithm called SRG-DQN.
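The core variance-reduction device is a stochastic recursive (SARAH-style) gradient estimator, sketched below on a simple least-squares problem rather than on deep Q-learning; the refresh schedule, step size, and problem are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 500, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.01 * rng.standard_normal(n)

grad_i = lambda w, i: (X[i] @ w - y[i]) * X[i]     # per-sample gradient of 0.5*(x_i.w - y_i)^2
full_grad = lambda w: X.T @ (X @ w - y) / n

w, lr = np.zeros(d), 0.02
for epoch in range(20):
    v = full_grad(w)                               # periodic full-gradient refresh
    w_prev, w = w.copy(), w - lr * v
    for _ in range(n):
        i = rng.integers(n)
        # Recursive gradient estimate: correct the previous estimate with the change
        # in one sample's gradient, which keeps the estimator's variance low.
        v = grad_i(w, i) - grad_i(w_prev, i) + v
        w_prev, w = w.copy(), w - lr * v
print("distance to the true weights:", float(np.linalg.norm(w - w_true)))
```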
arXiv Detail & Related papers (2020-07-25T00:54:20Z) - Tune smarter not harder: A principled approach to tuning learning rates
for shallow nets [13.203765985718201]
A principled approach to choosing the learning rate is proposed for shallow feedforward neural networks.
It is shown through simulations that the proposed search method significantly outperforms the existing tuning methods.
arXiv Detail & Related papers (2020-03-22T09:38:35Z)