Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction
- URL: http://arxiv.org/abs/2310.10958v2
- Date: Tue, 2 Jul 2024 16:57:06 GMT
- Title: Enhancing Deep Neural Network Training Efficiency and Performance through Linear Prediction
- Authors: Hejie Ying, Mengmeng Song, Yaohong Tang, Shungen Xiao, Zimin Xiao
- Abstract summary: Deep neural networks (DNNs) have achieved remarkable success in various fields, including computer vision and natural language processing.
This paper proposes a method to optimize DNN training, with the goal of improving model performance.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep neural networks (DNNs) have achieved remarkable success in various fields, including computer vision and natural language processing. However, training an effective DNN model still poses challenges. This paper proposes a method to optimize DNN training, with the goal of improving model performance. First, based on the observation that DNN parameters change according to certain laws during training, the potential of parameter prediction for improving training efficiency and performance is identified. Second, considering the magnitude of DNN model parameters, hardware limitations, and the noise tolerance of Stochastic Gradient Descent (SGD), a Parameter Linear Prediction (PLP) method is developed to perform DNN parameter prediction. Finally, validations are carried out on several representative backbones. Experimental results show that, compared to normal training under the same conditions and number of epochs, the proposed PLP method yields on average about a 1% accuracy improvement and a 0.01 top-1/top-5 error reduction for VGG16, ResNet18, and GoogLeNet on the CIFAR-100 dataset, demonstrating the effectiveness of the proposed method across different DNN structures and validating its capacity to enhance DNN training efficiency and performance.
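The abstract describes PLP only at a high level: periodically predict where each parameter is heading along its recent trajectory, jump to that predicted value, and rely on SGD's noise tolerance to absorb any prediction error. A minimal sketch of that idea in PyTorch is shown below; the function names, snapshot schedule, and extrapolation factor `alpha` are illustrative assumptions, not the authors' released implementation.

```python
# Minimal sketch of linear parameter prediction during training (assumed
# PyTorch). `plp_extrapolate`, `snapshot_every`, and `alpha` are illustrative
# names/values, not taken from the paper.
import torch


@torch.no_grad()
def plp_extrapolate(model, prev_state, alpha=1.0):
    """Extrapolate each parameter along its recent change direction:
    w_pred = w_now + alpha * (w_now - w_prev)."""
    for name, param in model.named_parameters():
        param.add_(alpha * (param - prev_state[name]))


def snapshot(model):
    """Detached copy of the current parameters."""
    return {n: p.detach().clone() for n, p in model.named_parameters()}


def train_with_plp(model, loader, loss_fn, epochs=30, snapshot_every=5,
                   alpha=1.0, lr=0.01):
    optimizer = torch.optim.SGD(model.parameters(), lr=lr, momentum=0.9)
    prev_state = snapshot(model)
    for epoch in range(epochs):
        for x, y in loader:
            optimizer.zero_grad()
            loss = loss_fn(model(x), y)
            loss.backward()
            optimizer.step()
        if (epoch + 1) % snapshot_every == 0:
            # Jump ahead along the observed parameter trajectory; subsequent
            # SGD steps can correct a rough prediction.
            plp_extrapolate(model, prev_state, alpha)
            prev_state = snapshot(model)
    return model
```

In this sketch the prediction is applied every few epochs between normal SGD updates, which matches the abstract's claim of improving results under the same number of training epochs.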
Related papers
- Grad-Instructor: Universal Backpropagation with Explainable Evaluation Neural Networks for Meta-learning and AutoML [0.0]
An Evaluation Neural Network (ENN) is trained via deep reinforcement learning to predict the performance of the target network.
The ENN then works as an additional evaluation function during backpropagation.
arXiv Detail & Related papers (2024-06-15T08:37:51Z)
- Online GNN Evaluation Under Test-time Graph Distribution Shifts [92.4376834462224]
A new research problem, online GNN evaluation, aims to provide valuable insights into well-trained GNNs' ability to generalize to real-world unlabeled graphs.
We develop an effective learning behavior discrepancy score, dubbed LeBeD, to estimate the test-time generalization errors of well-trained GNN models.
arXiv Detail & Related papers (2024-03-15T01:28:08Z)
- Control-Theoretic Techniques for Online Adaptation of Deep Neural Networks in Dynamical Systems [0.0]
Deep neural networks (DNNs) are currently the primary tool in modern artificial intelligence, machine learning, and data science.
In many applications, DNNs are trained offline, through supervised learning or reinforcement learning, and deployed online for inference.
We propose using techniques from control theory to update DNN parameters online.
arXiv Detail & Related papers (2024-02-01T16:51:11Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs can suffer training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
This paper proposes employing the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Optimising Event-Driven Spiking Neural Network with Regularisation and Cutoff [33.91830001268308]
Spiking neural networks (SNNs) offer promising improvements in computational efficiency.
Current SNN training methodologies predominantly employ a fixed timestep approach.
We propose a cutoff mechanism for SNNs that can terminate inference at any time to achieve efficient inference.
arXiv Detail & Related papers (2023-01-23T16:14:09Z)
- Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors.
Our work is the first attempt to optimize BNNs from the bilinear perspective.
We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
- LDP: Learnable Dynamic Precision for Efficient Deep Neural Network Training and Inference [24.431074439663437]
Learnable Dynamic Precision (LDP) is a framework that automatically learns a temporally and spatially dynamic precision schedule during training.
LDP consistently outperforms state-of-the-art (SOTA) low precision DNN training techniques in terms of training efficiency and achieved accuracy trade-offs.
arXiv Detail & Related papers (2022-03-15T08:01:46Z)
- Enhanced physics-constrained deep neural networks for modeling vanadium redox flow battery [62.997667081978825]
We propose an enhanced version of the physics-constrained deep neural network (PCDNN) approach to provide high-accuracy voltage predictions.
The ePCDNN can accurately capture the voltage response throughout the charge-discharge cycle, including the tail region of the voltage discharge curve.
arXiv Detail & Related papers (2022-03-03T19:56:24Z)
- Adaptive Degradation Process with Deep Learning-Driven Trajectory [5.060233857860902]
Remaining useful life (RUL) estimation is a crucial component in the implementation of intelligent predictive maintenance and health management.
This paper develops a hybrid DNN-based prognostic approach, where a Wiener-based degradation model is enhanced with adaptive drift to characterize the system degradation.
An LSTM-CNN encoder-decoder is developed to predict future degradation trajectories by jointly learning noise coefficients as well as drift coefficients, and adaptive drift is updated via Bayesian inference.
arXiv Detail & Related papers (2021-03-22T06:00:42Z)
- Deep Time Delay Neural Network for Speech Enhancement with Full Data Learning [60.20150317299749]
This paper proposes a deep time delay neural network (TDNN) for speech enhancement with full data learning.
To make full use of the training data, we propose a full data learning method for speech enhancement.
arXiv Detail & Related papers (2020-11-11T06:32:37Z)
- Bayesian Graph Neural Networks with Adaptive Connection Sampling [62.51689735630133]
We propose a unified framework for adaptive connection sampling in graph neural networks (GNNs).
The proposed framework not only alleviates over-smoothing and over-fitting tendencies of deep GNNs, but also enables learning with uncertainty in graph analytic tasks with GNNs.
arXiv Detail & Related papers (2020-06-07T07:06:35Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information provided and is not responsible for any consequences of its use.