Accelerating Training in Artificial Neural Networks with Dynamic Mode
Decomposition
- URL: http://arxiv.org/abs/2006.14371v1
- Date: Thu, 18 Jun 2020 22:59:55 GMT
- Title: Accelerating Training in Artificial Neural Networks with Dynamic Mode
Decomposition
- Authors: Mauricio E. Tano, Gavin D. Portwood, Jean C. Ragusa
- Abstract summary: We propose a method to decouple the evaluation of the update rule at each weight.
By fine-tuning the number of backpropagation steps used for each DMD model estimation, a significant reduction in the number of operations required to train the neural networks can be achieved.
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Training of deep neural networks (DNNs) frequently involves optimizing
several millions or even billions of parameters. Even with modern computing
architectures, the computational expense of DNN training can inhibit, for
instance, network architecture design optimization, hyper-parameter studies,
and integration into scientific research cycles. The key factor limiting
performance is that both the feed-forward evaluation and the back-propagation
rule are needed for each weight during optimization in the update rule. In this
work, we propose a method to decouple the evaluation of the update rule at each
weight. At first, Proper Orthogonal Decomposition (POD) is used to identify a
current estimate of the principal directions of evolution of weights per layer
during training based on the evolution observed with a few backpropagation
steps. Then, Dynamic Mode Decomposition (DMD) is used to learn the dynamics of
the evolution of the weights in each layer according to these principal
directions. The DMD model is used to evaluate an approximate converged state
when training the ANN. Afterward, some number of backpropagation steps are
performed, starting from the DMD estimates, leading to an update to the
principal directions and DMD model. This iterative process is repeated until
convergence. By fine-tuning the number of backpropagation steps used for each
DMD model estimation, a significant reduction in the number of operations
required to train the neural networks can be achieved. In this paper, the DMD
acceleration method will be explained in detail, along with the theoretical
justification for the acceleration provided by DMD. This method is illustrated
using a regression problem of key interest for the scientific machine learning
community: the prediction of a pollutant concentration field in a diffusion,
advection, reaction problem.
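The abstract describes an iterative loop: a few backpropagation steps generate weight snapshots, POD identifies the dominant directions of their evolution, a DMD model of the reduced dynamics is fit, and its approximate converged state replaces the weights before the next round of backpropagation. Below is a minimal sketch of that loop on a toy least-squares problem, assuming plain NumPy gradient descent as the "backpropagation" stage and an affine variant of the reduced DMD fit; the function names, POD rank, and snapshot count are illustrative choices, not the authors' implementation.
```python
import numpy as np

rng = np.random.default_rng(0)

# Toy regression: find W minimizing the mean squared error ||X W - Y||^2 / n.
X = rng.normal(size=(200, 10))
W_true = rng.normal(size=(10, 3))
Y = X @ W_true

def grad(W):
    """Gradient of the mean-squared-error loss with respect to the weights."""
    return X.T @ (X @ W - Y) / len(X)

def dmd_converged_state(snapshots, rank=5):
    """Fit a low-rank, affine DMD-style model z_{k+1} ~ A z_k + c in POD
    coordinates of the weight snapshots and return its fixed point as an
    estimate of the converged weights (the affine term is a sketch assumption)."""
    S = np.stack(snapshots, axis=1)                    # columns = flattened weights
    U, s, _ = np.linalg.svd(S, full_matrices=False)    # POD of the weight history
    r = int(min(rank, np.sum(s > 1e-10 * s[0])))
    U = U[:, :r]
    Z = U.T @ S                                        # reduced coordinates (r, n)
    Z0, Z1 = Z[:, :-1], Z[:, 1:]
    Z0_aug = np.vstack([Z0, np.ones((1, Z0.shape[1]))])
    Ac = Z1 @ np.linalg.pinv(Z0_aug)                   # least-squares fit of [A | c]
    A, c = Ac[:, :r], Ac[:, r]
    z_star, *_ = np.linalg.lstsq(np.eye(r) - A, c, rcond=None)  # fixed point of z = A z + c
    return U @ z_star                                  # lift back to weight space

W = np.zeros((10, 3))
lr, n_snap = 0.1, 10
for outer in range(20):
    snaps = []
    for _ in range(n_snap):                            # a few plain gradient steps
        W = W - lr * grad(W)                           # stand-in for backpropagation
        snaps.append(W.ravel().copy())
    W = dmd_converged_state(snaps).reshape(W.shape)    # jump toward the converged state
    loss = float(np.mean((X @ W - Y) ** 2))
    if loss < 1e-10:
        break
print(f"outer iterations: {outer + 1}, final loss: {loss:.2e}")
```
In this sketch the number of inner gradient steps per DMD fit (n_snap) plays the role of the tunable parameter the abstract mentions: fewer steps mean cheaper DMD updates but a noisier model of the weight dynamics.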
Related papers
- Analyzing and Improving the Training Dynamics of Diffusion Models [36.37845647984578]
We identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture.
We find that systematic application of this philosophy eliminates the observed drifts and imbalances, resulting in considerably better networks at equal computational complexity.
arXiv Detail & Related papers (2023-12-05T11:55:47Z)
- Dynamic Tensor Decomposition via Neural Diffusion-Reaction Processes [24.723536390322582]
Tensor decomposition is an important tool for multiway data analysis.
We propose Dynamic EMbedIngs fOr dynamic Tensor dEcomposition (DEMOTE).
We show the advantage of our approach in both simulation studies and real-world applications.
arXiv Detail & Related papers (2023-10-30T15:49:45Z)
- Score dynamics: scaling molecular dynamics with picoseconds timestep via conditional diffusion model [5.39025059364831]
We propose score dynamics (SD), a framework for learning accelerated evolution operators with large timesteps from molecular-dynamics simulations.
We construct graph-neural-network-based score dynamics models of realistic molecular systems that are evolved with 10 ps timesteps.
Our current SD implementation is about two orders of magnitude faster than the MD counterpart for the systems studied in this work.
arXiv Detail & Related papers (2023-10-02T22:29:45Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have been shown to be effective in solving forward and inverse differential equation problems.
However, PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process (the explicit and implicit updates are contrasted in the sketch below).
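As background (not taken from the paper itself), the implicit update evaluates the gradient at the new iterate rather than the current one; a standard side-by-side form is:
```latex
% Standard explicit vs. implicit (stochastic) gradient updates; the paper's
% exact formulation may differ.
\begin{align*}
  \text{explicit:} \quad & \theta_{k+1} = \theta_k - \eta\, \nabla L(\theta_k) \\
  \text{implicit:} \quad & \theta_{k+1} = \theta_k - \eta\, \nabla L(\theta_{k+1})
    \quad \text{(solved for } \theta_{k+1} \text{ at each step)}
\end{align*}
```
Because the implicit step is a proximal-type update, it remains stable for larger step sizes, which is the kind of stability benefit the summary refers to.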
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- A predictive physics-aware hybrid reduced order model for reacting flows [65.73506571113623]
A new hybrid predictive Reduced Order Model (ROM) is proposed to solve reacting flow problems.
The number of degrees of freedom is reduced from thousands of temporal points to a few POD modes with their corresponding temporal coefficients.
Two different deep learning architectures have been tested to predict the temporal coefficients.
arXiv Detail & Related papers (2023-01-24T08:39:20Z)
- Forecasting through deep learning and modal decomposition in two-phase concentric jets [2.362412515574206]
This work aims to improve fuel chamber injectors' performance in turbofan engines.
This requires the development of models that allow real-time prediction and improvement of the fuel/air mixture.
arXiv Detail & Related papers (2022-12-24T12:59:41Z)
- Coupled and Uncoupled Dynamic Mode Decomposition in Multi-Compartmental Systems with Applications to Epidemiological and Additive Manufacturing Problems [58.720142291102135]
We show that Dynamic Mode Decomposition (DMD) may be a powerful tool when applied to nonlinear problems.
In particular, we show interesting numerical applications to a continuous delayed-SIRD model for Covid-19.
arXiv Detail & Related papers (2021-10-12T21:42:14Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures for artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
- Decadal Forecasts with ResDMD: a Residual DMD Neural Network [0.0]
Operational forecasting centers are investing in decadal (1-10 year) forecast systems to support long-term decision making for a more climate-resilient society.
One method that has previously been employed is the Dynamic Mode Decomposition (DMD) algorithm.
We investigate an extension to the DMD that explicitly represents the non-linear terms as a neural network.
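Schematically, under assumed notation rather than the paper's, a standard DMD model advances the state with a linear operator, while the residual extension adds a learned nonlinear correction:
```latex
% Assumed notation: x_k is the system state, A the DMD operator, f_theta a
% neural network modeling the residual (non-linear) terms.
\begin{align*}
  \text{DMD:}    \quad & x_{k+1} \approx A\, x_k \\
  \text{ResDMD:} \quad & x_{k+1} \approx A\, x_k + f_{\theta}(x_k)
\end{align*}
```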
arXiv Detail & Related papers (2021-06-21T13:49:43Z)
- Dynamic Mode Decomposition in Adaptive Mesh Refinement and Coarsening Simulations [58.720142291102135]
Dynamic Mode Decomposition (DMD) is a powerful data-driven method used to extract coherent structures.
This paper proposes a strategy to enable DMD to extract such structures from observations with different mesh topologies and dimensions.
arXiv Detail & Related papers (2021-04-28T22:14:25Z)
- Large-scale Neural Solvers for Partial Differential Equations [48.7576911714538]
Solving partial differential equations (PDE) is an indispensable part of many branches of science as many processes can be modelled in terms of PDEs.
Recent numerical solvers require manual discretization of the underlying equation as well as sophisticated, tailored code for distributed computing.
We examine the applicability of continuous, mesh-free neural solvers for partial differential equations, namely physics-informed neural networks (PINNs).
We discuss the accuracy of GatedPINN with respect to analytical solutions, as well as state-of-the-art numerical solvers such as spectral solvers.
arXiv Detail & Related papers (2020-09-08T13:26:51Z)