Using a novel fractional-order gradient method for CNN back-propagation
- URL: http://arxiv.org/abs/2205.00581v1
- Date: Sun, 1 May 2022 23:38:06 GMT
- Title: Using a novel fractional-order gradient method for CNN back-propagation
- Authors: Mundher Mohammed Taresh, Ningbo Zhu, Talal Ahmed Ali Ali, Mohammed
Alghaili and Weihua Guo
- Abstract summary: Researchers propose a novel deep learning model and apply it to COVID-19 diagnosis.
The model uses fractional calculus, which has the potential to improve the performance of gradient methods.
- Score: 1.6679382332181059
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Computer-aided diagnosis tools have experienced rapid growth and development
in recent years. Among all, deep learning is the most sophisticated and popular
tool. In this paper, we propose a novel deep learning model and apply it to
COVID-19 diagnosis. Our model uses fractional calculus, which has the potential
to improve the performance of gradient methods. To this end, we propose a
fractional-order gradient method for the
back-propagation of convolutional neural networks based on the Caputo
definition. However, if only the first term of the infinite series of the
Caputo definition is used to approximate the fractional-order derivative, the
length of the memory is truncated. Therefore, the fractional-order gradient
descent (FGD) method with a fixed memory step and an adjustable number of terms
is used to update the weights of the layers. Experiments were performed on the COVIDx
dataset to demonstrate fast convergence, good accuracy, and the ability to
bypass local optima. We also compared the performance of the developed
fractional-order neural networks and integer-order neural networks.
The results confirmed the effectiveness of our proposed model in the diagnosis
of COVID-19.
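For concreteness, below is a minimal Python sketch of a first-term, Caputo-based fractional-order gradient update in which the previous iterate serves as the fixed memory step. The function name fgd_step, the hyperparameters, and the toy quadratic objective are illustrative assumptions rather than the paper's code, and the paper's adjustable multi-term variant (which requires higher-order derivative estimates) is not reproduced here.

import numpy as np
from math import gamma

def fgd_step(w, w_prev, grad, lr=0.1, alpha=0.9, eps=1e-8):
    # First-term Caputo approximation of the fractional-order gradient step:
    # w <- w - lr * grad * |w - w_prev|^(1 - alpha) / Gamma(2 - alpha).
    # The previous iterate w_prev acts as the fixed memory step; eps guards the
    # fractional power when w and w_prev coincide. (Illustrative sketch only.)
    memory = np.abs(w - w_prev) + eps
    scale = memory ** (1.0 - alpha) / gamma(2.0 - alpha)
    return w - lr * grad * scale

# Toy usage: minimize f(w) = ||w||^2 / 2, whose gradient is simply w.
w_prev = np.array([2.0, -3.0])
w = np.array([1.5, -2.5])
for _ in range(100):
    g = w                       # gradient of the toy objective at w
    w, w_prev = fgd_step(w, w_prev, g), w
print(w)                        # should approach the minimizer at the origin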
Related papers
- Fractional-order spike-timing-dependent gradient descent for multi-layer spiking neural networks [18.142378139047977]
This paper proposes a fractional-order spike-timing-dependent gradient descent (FOSTDGD) learning model.
It is tested on the MNIST and DVS128 Gesture datasets, and its accuracy under different network structures and fractional orders is analyzed.
arXiv Detail & Related papers (2024-10-20T05:31:34Z) - An Explainable Deep Learning-Based Method For Schizophrenia Diagnosis Using Generative Data-Augmentation [0.3222802562733786]
We leverage a deep learning-based method for the automatic diagnosis of schizophrenia using EEG brain recordings.
This approach utilizes generative data augmentation, a powerful technique that enhances the accuracy of the diagnosis.
arXiv Detail & Related papers (2023-10-25T12:55:16Z) - Low-rank extended Kalman filtering for online learning of neural
networks from streaming data [71.97861600347959]
We propose an efficient online approximate Bayesian inference algorithm for estimating the parameters of a nonlinear function from a potentially non-stationary data stream.
The method is based on the extended Kalman filter (EKF), but uses a novel low-rank plus diagonal decomposition of the posterior matrix.
In contrast to methods based on variational inference, our method is fully deterministic, and does not require step-size tuning.
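For orientation, here is a minimal sketch of the plain EKF recursion for online parameter estimation that this line of work builds on; the toy model, function names, and noise settings are assumptions, and the paper's low-rank plus diagonal factorization of the posterior, which is what lets the recursion scale to large networks, is not reproduced here.

import numpy as np

def ekf_step(mu, Sigma, x, y, f, jac, obs_var=0.1):
    # One EKF update of the parameter mean mu and covariance Sigma on a single
    # scalar observation (x, y); f is the model and jac its Jacobian wrt mu.
    H = jac(mu, x)[None, :]                  # 1 x d Jacobian
    S = H @ Sigma @ H.T + obs_var            # innovation variance
    K = (Sigma @ H.T) / S                    # Kalman gain, d x 1
    mu = mu + (K * (y - f(mu, x))).ravel()   # mean update
    Sigma = Sigma - K @ H @ Sigma            # covariance update
    return mu, Sigma

# Toy usage: track the parameters of y = w0 * tanh(w1 * x) from a data stream.
f = lambda w, x: w[0] * np.tanh(w[1] * x)
jac = lambda w, x: np.array([np.tanh(w[1] * x), w[0] * x / np.cosh(w[1] * x) ** 2])
rng = np.random.default_rng(0)
mu, Sigma = np.array([0.5, 0.5]), np.eye(2)
for _ in range(200):
    x = rng.uniform(-2, 2)
    y = 1.5 * np.tanh(0.7 * x) + 0.05 * rng.normal()
    mu, Sigma = ekf_step(mu, Sigma, x, y, f, jac)
print(mu)                                    # should move toward [1.5, 0.7]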
arXiv Detail & Related papers (2023-05-31T03:48:49Z) - A Bootstrap Algorithm for Fast Supervised Learning [0.0]
Training a neural network (NN) typically relies on some type of curve-following method, such as gradient descent (and stochastic gradient descent (SGD)), ADADELTA, ADAM, or limited-memory algorithms.
Convergence for these algorithms usually relies on having access to a large quantity of observations in order to achieve a high level of accuracy and, with certain classes of functions, these algorithms could take multiple epochs of data points to catch on.
Herein, a different technique with the potential of achieving dramatically better speeds of convergence is explored: it does not curve-follow but rather relies on 'decoupling' hidden layers ...
arXiv Detail & Related papers (2023-05-04T18:28:18Z) - Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that, asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
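As a small numerical probe of the effect described (not the paper's exact regime or its rank-two limit), the sketch below takes one full-batch gradient step on a two-layer leaky-ReLU network with a tiny random initialization and compares the effective rank of the first-layer weights before and after; the logistic loss, Gaussian data, sizes, and rank threshold are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)
n, d, m, leak, lr = 20, 500, 100, 0.1, 1.0    # samples, input dim, width, leak slope, step size
X = rng.normal(size=(n, d)) / np.sqrt(d)      # near-orthonormal inputs in high dimension
y = rng.choice([-1.0, 1.0], size=n)
W = 1e-5 * rng.normal(size=(m, d))            # tiny initialization variance
a = rng.choice([-1.0, 1.0], size=m) / np.sqrt(m)  # fixed second layer

def effective_rank(M, tol=1e-2):
    # Number of singular values above tol times the largest one.
    s = np.linalg.svd(M, compute_uv=False)
    return int((s > tol * s[0]).sum())

# One full-batch gradient step on the logistic loss of f(x) = a . leaky_relu(W x).
Z = X @ W.T                                   # pre-activations, n x m
H = np.where(Z > 0, Z, leak * Z)              # leaky ReLU
f = H @ a
g_f = -y / (1.0 + np.exp(y * f)) / n          # per-sample dLoss/df
G = (g_f[:, None] * np.where(Z > 0, 1.0, leak) * a[None, :]).T @ X   # dLoss/dW, m x d
print("effective rank before:", effective_rank(W))           # close to full rank
print("effective rank after :", effective_rank(W - lr * G))  # far fewer significant directions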
arXiv Detail & Related papers (2022-10-13T15:09:54Z) - Deep Manifold Learning with Graph Mining [80.84145791017968]
We propose a novel graph deep model with a non-gradient decision layer for graph mining.
The proposed model has achieved state-of-the-art performance compared to the current models.
arXiv Detail & Related papers (2022-07-18T04:34:08Z) - Invariance Learning in Deep Neural Networks with Differentiable Laplace
Approximations [76.82124752950148]
We develop a convenient gradient-based method for selecting the data augmentation.
We use a differentiable Kronecker-factored Laplace approximation to the marginal likelihood as our objective.
arXiv Detail & Related papers (2022-02-22T02:51:11Z) - Optimization-Based Separations for Neural Networks [57.875347246373956]
We show that gradient descent can efficiently learn ball indicator functions using a depth 2 neural network with two layers of sigmoidal activations.
This is the first optimization-based separation result where the approximation benefits of the stronger architecture provably manifest in practice.
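The sketch below is a toy, low-dimensional illustration of that setup: full-batch gradient descent on a depth-2 network with sigmoidal activations in both layers, fit to a ball indicator. The 2-D inputs, width, cross-entropy loss, and hyperparameters are assumptions for illustration only; the separation result itself concerns higher-dimensional balls.

import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))

X = rng.uniform(-2, 2, size=(1000, 2))
y = (np.linalg.norm(X, axis=1) <= 1.0).astype(float)   # ball indicator labels

m, lr = 32, 2.0                                         # hidden width, step size
W, b = rng.normal(size=(m, 2)), np.zeros(m)             # first sigmoidal layer
v, c = rng.normal(size=m) / np.sqrt(m), 0.0             # second sigmoidal (output) layer

for _ in range(5000):                                   # full-batch gradient descent
    H = sigmoid(X @ W.T + b)
    p = sigmoid(H @ v + c)
    g_out = (p - y) / len(y)                            # grad of cross-entropy wrt output pre-activation
    g_H = np.outer(g_out, v) * H * (1 - H)              # backprop through the hidden sigmoids
    W -= lr * g_H.T @ X; b -= lr * g_H.sum(axis=0)
    v -= lr * H.T @ g_out; c -= lr * g_out.sum()

p = sigmoid(sigmoid(X @ W.T + b) @ v + c)
print("train accuracy:", ((p > 0.5) == (y > 0.5)).mean())  # typically well above the ~0.8 'always outside' base rate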
arXiv Detail & Related papers (2021-12-04T18:07:47Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
Deep neural networks (DNNs) generally take thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Research of Damped Newton Stochastic Gradient Descent Method for Neural
Network Training [6.231508838034926]
First-order methods like stochastic gradient descent (SGD) are currently the most popular optimization methods for training deep neural networks (DNNs).
In this paper, we propose the Damped Newton Stochastic Gradient Descent (DN-SGD) and Stochastic Gradient Descent Damped Newton (SGD-DN) methods to train DNNs for regression problems with Mean Square Error (MSE) and classification problems with Cross-Entropy Loss (CEL).
Our methods accurately compute only a small part of the parameters, which greatly reduces the computational cost and makes the learning process much faster and more accurate than SGD.
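The sketch below illustrates the general idea of such a hybrid scheme: a damped Newton step on one small parameter block (here, the linear output layer, whose Hessian under MSE is cheap to form) combined with a plain gradient step on the remaining parameters. The tanh hidden layer, full-batch updates, and damping constant are assumptions, and the paper's exact parameter split and update ordering may differ.

import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(256, 5))
y = np.sin(X @ rng.normal(size=5))                      # toy regression target

m, lr, damping = 16, 0.05, 1e-2
W = 0.5 * rng.normal(size=(m, 5))                       # hidden layer, updated by gradient steps
v = np.zeros(m)                                         # output layer, updated by damped Newton steps

for _ in range(200):
    H = np.tanh(X @ W.T)                                # hidden activations
    r = H @ v - y                                       # residuals under the MSE loss
    # Damped Newton step on the output block: solve (H^T H / n + damping I) dv = H^T r / n.
    A = H.T @ H / len(y) + damping * np.eye(m)
    v -= np.linalg.solve(A, H.T @ r / len(y))
    # Plain gradient step on the hidden block, using the refreshed residuals.
    r = H @ v - y
    g_H = np.outer(r, v) * (1 - H ** 2) / len(y)        # backprop through tanh
    W -= lr * g_H.T @ X

print("final MSE:", np.mean((np.tanh(X @ W.T) @ v - y) ** 2))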
arXiv Detail & Related papers (2021-03-31T02:07:18Z) - COVID-CLNet: COVID-19 Detection with Compressive Deep Learning
Approaches [0.0]
We propose a computer-aided detection (CADe) system that uses computed tomography (CT) scan images.
The proposed boosted deep learning network (CLNet) is based on the implementation of Deep Learning (DL) networks.
Experiments performed with different compression methods show promising results for COVID-19 detection.
arXiv Detail & Related papers (2020-12-03T19:56:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.