Newton methods based convolution neural networks using parallel
processing
- URL: http://arxiv.org/abs/2112.01401v3
- Date: Wed, 5 Apr 2023 08:39:02 GMT
- Title: Newton methods based convolution neural networks using parallel
processing
- Authors: Ujjwal Thakur, Anuj Sharma
- Abstract summary: Training of convolutional neural networks is a high-dimensional and non-convex optimization problem.
Newton methods for convolutional neural networks deal with this by using sub-sampled Hessian Newton methods.
We have used parallel processing instead of serial processing in mini-batch computations.
- Score: 3.9220281834178463
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Training of convolutional neural networks is a high-dimensional and
non-convex optimization problem. At present, it is inefficient in situations
where parametric learning rates cannot be confidently set. Some past works
have introduced Newton methods for training deep neural networks. Newton
methods for convolutional neural networks involve complicated operations.
Finding the Hessian matrix in second-order methods becomes very complex, as we
mainly use the finite-difference method with image data. Newton methods
for convolutional neural networks deal with this by using sub-sampled
Hessian Newton methods. In this paper, we have used the complete data instead
of the sub-sampled methods that only handle partial data at a time. Further, we
have used parallel processing instead of serial processing in mini-batch
computations. The results obtained using parallel processing in this study
outperform the previous approach in terms of training time.
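The core idea of the abstract, forming curvature information over the complete data by distributing mini-batch computations across workers instead of looping over them serially, can be illustrated with a small sketch. The paper provides no code here, so the toy least-squares loss, the finite-difference Hessian-vector product, and all function names below are assumptions for illustration only, not the authors' implementation.

```python
# Minimal sketch (not the authors' code): per-mini-batch Hessian-vector
# products H_i @ v computed in parallel and averaged over the complete data,
# as a Newton/CG-style solver would require. Toy least-squares model assumed.
import numpy as np
from multiprocessing import Pool

def batch_grad(w, batch):
    """Gradient of a toy least-squares loss on one mini-batch (X, y)."""
    X, y = batch
    return X.T @ (X @ w - y) / len(y)

def batch_hvp(args):
    """Central finite-difference approximation of H_batch @ v."""
    w, v, batch, eps = args
    return (batch_grad(w + eps * v, batch) - batch_grad(w - eps * v, batch)) / (2 * eps)

def full_data_hvp(w, v, batches, eps=1e-4, workers=4):
    """H @ v aggregated over ALL mini-batches, with batches processed in parallel."""
    with Pool(workers) as pool:
        partials = pool.map(batch_hvp, [(w, v, b, eps) for b in batches])
    return np.mean(partials, axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X, y = rng.normal(size=(512, 10)), rng.normal(size=512)
    batches = [(X[i:i + 64], y[i:i + 64]) for i in range(0, 512, 64)]
    w, v = np.zeros(10), rng.normal(size=10)
    print(full_data_hvp(w, v, batches))  # curvature information for a Newton step
```

In this sketch the per-batch work is independent, so the serial loop over mini-batches is replaced by a worker pool; the averaged product is what an inner conjugate-gradient solver would consume when computing a Newton direction.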
Related papers
- Newton Losses: Using Curvature Information for Learning with Differentiable Algorithms [80.37846867546517]
We show how to train eight different neural networks with custom objectives.
We exploit their second-order information via their empirical Fisher and Hessian matrices.
We apply Newton Losses to achieve significant improvements for less-differentiable algorithms.
arXiv Detail & Related papers (2024-10-24T18:02:11Z)
- On Newton's Method to Unlearn Neural Networks [44.85793893441989]
We seek approximate unlearning algorithms for neural networks (NNs) that return identical models to the retrained oracle.
We propose CureNewton's method, a principled approach that leverages cubic regularization to handle the Hessian degeneracy effectively.
Experiments across different models and datasets show that our method can achieve unlearning performance competitive with the state-of-the-art algorithm in practical unlearning settings.
arXiv Detail & Related papers (2024-06-20T17:12:20Z)
- An Initialization Schema for Neuronal Networks on Tabular Data [0.9155684383461983]
We show that a binomial neural network can be used effectively on tabular data.
The proposed schema is a simple but effective way to initialize the first hidden layer in neural networks.
We evaluate our approach on multiple public datasets and showcase the improved performance compared to other neural network-based approaches.
arXiv Detail & Related papers (2023-11-07T13:52:35Z)
- Embedding stochastic differential equations into neural networks via dual processes [0.0]
We propose a new approach to constructing a neural network for predicting expectations of stochastic differential equations.
The proposed method does not need data sets of inputs and outputs.
As a demonstration, we construct neural networks for the Ornstein-Uhlenbeck process and the noisy van der Pol system.
arXiv Detail & Related papers (2023-06-08T00:50:16Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Free Probability, Newton lilypads and Jacobians of neural networks [0.0]
We present a reliable and very fast method for computing the associated spectral densities.
Our technique is based on an adaptive Newton-Raphson scheme that finds and chains basins of attraction.
arXiv Detail & Related papers (2021-11-01T11:22:42Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Local Extreme Learning Machines and Domain Decomposition for Solving Linear and Nonlinear Partial Differential Equations [0.0]
We present a neural network-based method for solving linear and nonlinear partial differential equations.
The method combines the ideas of extreme learning machines (ELM), domain decomposition and local neural networks.
We compare the current method with the deep Galerkin method (DGM) and the physics-informed neural network (PINN) in terms of the accuracy and computational cost.
arXiv Detail & Related papers (2020-12-04T23:19:39Z)
- Parallelization Techniques for Verifying Neural Networks [52.917845265248744]
We introduce an algorithm that partitions the verification problem in an iterative manner and explore two partitioning strategies.
We also introduce a highly parallelizable pre-processing algorithm that uses the neuron activation phases to simplify the neural network verification problems.
arXiv Detail & Related papers (2020-04-17T20:21:47Z)
- Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving [106.63673243937492]
Feedforward computation, such as evaluating a neural network or sampling from an autoregressive model, is ubiquitous in machine learning.
We frame the task of feedforward computation as solving a system of nonlinear equations. We then propose to find the solution using a Jacobi or Gauss-Seidel fixed-point method, as well as hybrid methods of both.
Our method is guaranteed to give exactly the same values as the original feedforward computation with a reduced (or equal) number of parallelizable iterations, and hence reduced time given sufficient parallel computing power (a toy sketch of this fixed-point view follows the list below).
arXiv Detail & Related papers (2020-02-10T10:11:31Z)
- On the distance between two neural networks and the stability of learning [59.62047284234815]
This paper relates parameter distance to gradient breakdown for a broad class of nonlinear compositional functions.
The analysis leads to a new distance function called deep relative trust and a descent lemma for neural networks.
arXiv Detail & Related papers (2020-02-09T19:18:39Z)
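The fixed-point framing in the "Accelerating Feedforward Computation via Parallel Nonlinear Equation Solving" entry above can be illustrated with a toy example. That paper operates on neural networks and autoregressive models; the scalar recursion, function names, and iteration counts below are assumptions chosen only to show that a Jacobi-style iteration recovers the sequential answer, not the paper's actual solvers.

```python
# Toy sketch (assumptions, not the paper's code) of the fixed-point view of
# feedforward computation: the sequential recursion x[t] = f(x[t-1]) is the
# solution of the nonlinear system x - F(x) = 0, which a Jacobi iteration can
# solve by updating all steps simultaneously (and hence in parallel).
import numpy as np

def f(x_prev):
    """One illustrative 'layer'/step of the feedforward recursion."""
    return np.tanh(0.9 * x_prev + 0.1)

def sequential(x0, steps):
    """Ordinary serial evaluation of the recursion."""
    xs = [x0]
    for _ in range(steps):
        xs.append(f(xs[-1]))
    return np.array(xs[1:])

def jacobi(x0, steps, iters=50):
    """Jacobi fixed-point iteration: every step is refreshed from the previous
    iterate at once, so each sweep is embarrassingly parallel."""
    x = np.zeros(steps)
    for _ in range(iters):
        prev = np.concatenate(([x0], x[:-1]))  # inputs to every step, all at once
        x = f(prev)
    return x

if __name__ == "__main__":
    seq = sequential(0.0, steps=20)
    par = jacobi(0.0, steps=20)
    print(np.max(np.abs(seq - par)))  # converges to the sequential result
```

The sweep count needed for convergence never exceeds the number of steps, and for contractive recursions it can be much smaller, which is where the parallel speedup described in that entry comes from.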
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.