Randomized Sparse Neural Galerkin Schemes for Solving Evolution
Equations with Deep Networks
- URL: http://arxiv.org/abs/2310.04867v1
- Date: Sat, 7 Oct 2023 16:27:00 GMT
- Title: Randomized Sparse Neural Galerkin Schemes for Solving Evolution
Equations with Deep Networks
- Authors: Jules Berman, Benjamin Peherstorfer
- Abstract summary: This work introduces Neural Galerkin schemes that update randomized sparse subsets of network parameters at each time step.
The randomization avoids overfitting locally in time and so helps prevent the error from accumulating quickly over the sequential-in-time training.
In numerical experiments with a wide range of evolution equations, the proposed scheme with randomized sparse updates is up to two orders of magnitude more accurate at a fixed computational budget.
- Score: 1.5281057386824812
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Training neural networks sequentially in time to approximate solution fields
of time-dependent partial differential equations can be beneficial for
preserving causality and other physics properties; however, the
sequential-in-time training is numerically challenging because training errors
quickly accumulate and amplify over time. This work introduces Neural Galerkin
schemes that update randomized sparse subsets of network parameters at each
time step. The randomization avoids overfitting locally in time and so helps
prevent the error from accumulating quickly over the sequential-in-time
training, which is motivated by dropout that addresses a similar issue of
overfitting due to neuron co-adaptation. The sparsity of the update reduces the
computational costs of training without losing expressiveness because many of
the network parameters are redundant locally at each time step. In numerical
experiments with a wide range of evolution equations, the proposed scheme with
randomized sparse updates is up to two orders of magnitude more accurate at a
fixed computational budget and up to two orders of magnitude faster at a fixed
accuracy than schemes with dense updates.
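To make the time stepping concrete, below is a minimal sketch of one sequential-in-time update with a randomized sparse subset of parameters, assuming a scalar network u(theta, x) with a flattened parameter vector, a right-hand side f(x, u) of the evolution equation, and an explicit Euler integrator. The function names (neural_galerkin_step, u_fn, f_fn) and the JAX-based formulation are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Neural Galerkin step with a randomized sparse update.
# Assumptions: flat parameter vector, scalar network output, explicit Euler;
# this is an illustration, not the reference implementation of the paper.
import jax
import jax.numpy as jnp

def neural_galerkin_step(theta, key, x, u_fn, f_fn, dt, keep_frac=0.1):
    """Advance theta by one time step, updating only a random subset of entries.

    theta: (n_theta,) flattened network parameters
    x:     (n_x,) collocation points sampled in the spatial domain
    u_fn:  u_fn(theta, x_i) -> scalar network output at a single point
    f_fn:  f_fn(x_i, u_i)   -> scalar right-hand side of the evolution equation
    """
    # Jacobian of the network output with respect to the parameters.
    J = jax.vmap(jax.grad(u_fn), in_axes=(None, 0))(theta, x)    # (n_x, n_theta)
    u_vals = jax.vmap(u_fn, in_axes=(None, 0))(theta, x)         # (n_x,)
    rhs = jax.vmap(f_fn)(x, u_vals)                               # (n_x,)

    # Randomized sparse subset of parameters to update at this time step.
    n_theta = theta.shape[0]
    n_keep = max(1, int(keep_frac * n_theta))
    idx = jax.random.choice(key, n_theta, shape=(n_keep,), replace=False)

    # Galerkin least-squares condition restricted to the selected parameters.
    theta_dot_sub, *_ = jnp.linalg.lstsq(J[:, idx], rhs, rcond=None)

    # Explicit Euler update applied only to the selected entries;
    # all other parameters stay frozen during this step.
    return theta.at[idx].add(dt * theta_dot_sub)
```

Restricting the least-squares system to the sampled Jacobian columns is what lowers the per-step training cost, and drawing a fresh subset (a new key) at every step plays the dropout-like role described in the abstract.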
Related papers
- Gradient-free training of recurrent neural networks [3.272216546040443]
We introduce a computational approach to construct all weights and biases of a recurrent neural network without using gradient-based methods.
The approach is based on a combination of random feature networks and Koopman operator theory for dynamical systems.
In computational experiments on time series, forecasting for chaotic dynamical systems, and control problems, we observe that the training time and forecasting accuracy of the recurrent neural networks we construct are improved.
arXiv Detail & Related papers (2024-10-30T21:24:34Z)
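For the gradient-free construction summarized above, the sketch below shows a generic random-feature recipe: recurrent and input weights are sampled once and never updated, and only a linear readout is fitted in closed form. The sampling distributions and the Koopman-operator-based construction of the paper differ; all names here are hypothetical.

```python
# Generic random-feature sketch of gradient-free recurrent-network training:
# sample the recurrent weights, then fit only a linear readout by least squares.
# Illustrative only; the paper's Koopman-based construction is more involved.
import numpy as np

rng = np.random.default_rng(0)

def fit_random_feature_rnn(inputs, targets, n_hidden=256, reg=1e-6):
    """inputs: (T, d_in), targets: (T, d_out); returns (W_in, W_rec, W_out)."""
    d_in = inputs.shape[1]
    # Weights are drawn once from simple distributions (no gradient updates).
    W_in = rng.normal(scale=1.0 / np.sqrt(d_in), size=(n_hidden, d_in))
    W_rec = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))

    # Roll the fixed random recurrence forward to collect hidden states.
    h = np.zeros(n_hidden)
    H = np.empty((inputs.shape[0], n_hidden))
    for t, x_t in enumerate(inputs):
        h = np.tanh(W_in @ x_t + W_rec @ h)
        H[t] = h

    # Linear readout from hidden states via ridge-regularized least squares.
    W_out = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ targets).T
    return W_in, W_rec, W_out
```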
- Temporal Convolution Derived Multi-Layered Reservoir Computing [5.261277318790788]
We propose a new mapping of input data into the reservoir's state space.
We incorporate this method into two novel network architectures that increase the parallelizability, depth, and predictive capability of the neural network.
For the chaotic time series, we observe an error reduction of up to 85.45% compared to Echo State Networks and 90.72% compared to Gated Recurrent Units.
arXiv Detail & Related papers (2024-07-09T11:40:46Z)
- Time-Parameterized Convolutional Neural Networks for Irregularly Sampled Time Series [26.77596449192451]
Irregularly sampled time series are ubiquitous in several application domains, leading to sparse, not fully-observed and non-aligned observations.
Standard recurrent neural networks (RNNs) and convolutional neural networks (CNNs) assume regular spacing between observation times, which poses significant challenges for modeling irregularly sampled time series.
We parameterize convolutional layers by employing kernels that are explicit functions of time.
arXiv Detail & Related papers (2023-08-06T21:10:30Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures of artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
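The implicit-differentiation idea summarized above can be sketched generically in the style of deep equilibrium models: gradients are taken at the equilibrium state via the implicit function theorem rather than by reversing the forward computation. The SNN-specific formulation of the paper differs; the fixed-point solver and names below are assumptions.

```python
# Generic implicit differentiation at an equilibrium state z* = f(z*, theta);
# illustrative sketch, not the paper's SNN-specific training method.
import jax

def equilibrium_grad(f, z_star, theta, dL_dz, n_iter=50):
    """Gradient of a loss w.r.t. theta through a fixed point z* = f(z*, theta),
    without backpropagating through the forward iterations."""
    # Solve v = dL_dz + v * (df/dz at z*) by fixed-point iteration,
    # i.e. v = dL_dz (I - df/dz)^{-1} from the implicit function theorem.
    _, vjp_z = jax.vjp(lambda z: f(z, theta), z_star)
    v = dL_dz
    for _ in range(n_iter):
        v = dL_dz + vjp_z(v)[0]
    # Push the adjoint v through df/dtheta to obtain the parameter gradient.
    _, vjp_theta = jax.vjp(lambda th: f(z_star, th), theta)
    return vjp_theta(v)[0]
```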
- Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing [16.69710555668727]
We propose a novel neuron model that has a cosine activation with a time-varying component for sequential processing.
The proposed neuron provides an efficient building block for projecting sequential inputs into spectral domain.
Applied to sentiment analysis on the IMDB dataset, the proposed model reaches 89.4% test accuracy within 5 epochs.
arXiv Detail & Related papers (2021-09-14T19:08:07Z)
- What training reveals about neural network complexity [80.87515604428346]
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to the regularization of neural networks based on the local Rademacher complexity.
Based on the proposed upper bound of the local Rademacher complexity, a new regularization function is developed for both fully-connected networks (FCNs) and convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases [3.198144010381572]
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches reach their limits.
It is still unclear why randomly initialized gradient descent algorithms perform as well as they do in this setting.
arXiv Detail & Related papers (2021-02-23T18:17:47Z)
- Predicting Training Time Without Training [120.92623395389255]
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
We leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model.
We are able to predict the time it takes to fine-tune a model to a given loss without having to perform any training.
arXiv Detail & Related papers (2020-08-28T04:29:54Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
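As a rough illustration of the construction summarized above, the cell below is a linear first-order system whose effective time constant is modulated by a nonlinear gate. The sigmoid gate, the explicit Euler solver, and the parameter layout are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of a liquid-time-constant style cell: linear first-order dynamics
# with a state- and input-dependent gate; illustrative assumptions only.
import numpy as np

def ltc_cell_step(x, inp, params, dt=0.1, tau=1.0):
    """One explicit Euler step of dx/dt = -x / tau + gate(x, inp) * (A - x)."""
    W_x, W_i, b, A = params["W_x"], params["W_i"], params["b"], params["A"]
    gate = 1.0 / (1.0 + np.exp(-(W_x @ x + W_i @ inp + b)))  # sigmoid gate
    dxdt = -x / tau + gate * (A - x)
    return x + dt * dxdt
```

The gate shortens or lengthens the effective time constant depending on the current state and input, which is one way to obtain the bounded trajectories the summary mentions.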