Randomized Sparse Neural Galerkin Schemes for Solving Evolution
Equations with Deep Networks
- URL: http://arxiv.org/abs/2310.04867v1
- Date: Sat, 7 Oct 2023 16:27:00 GMT
- Title: Randomized Sparse Neural Galerkin Schemes for Solving Evolution
Equations with Deep Networks
- Authors: Jules Berman, Benjamin Peherstorfer
- Abstract summary: This work introduces Neural Galerkin schemes that update randomized sparse subsets of network parameters at each time step.
The randomization avoids overfitting locally in time and so helps prevent the error from accumulating quickly over the sequential-in-time training.
In numerical experiments with a wide range of evolution equations, the proposed scheme with randomized sparse updates is up to two orders of magnitude more accurate at a fixed computational budget.
- Score: 1.5281057386824812
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Training neural networks sequentially in time to approximate solution fields
of time-dependent partial differential equations can be beneficial for
preserving causality and other physics properties; however, the
sequential-in-time training is numerically challenging because training errors
quickly accumulate and amplify over time. This work introduces Neural Galerkin
schemes that update randomized sparse subsets of network parameters at each
time step. The randomization avoids overfitting locally in time and so helps
prevent the error from accumulating quickly over the sequential-in-time
training, which is motivated by dropout that addresses a similar issue of
overfitting due to neuron co-adaptation. The sparsity of the update reduces the
computational costs of training without losing expressiveness because many of
the network parameters are redundant locally at each time step. In numerical
experiments with a wide range of evolution equations, the proposed scheme with
randomized sparse updates is up to two orders of magnitude more accurate at a
fixed computational budget and up to two orders of magnitude faster at a fixed
accuracy than schemes with dense updates.
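To make the time stepping concrete, below is a minimal sketch of one sequential-in-time update with a randomized sparse subset of parameters, assuming a scalar network u(theta, x) with a flattened parameter vector, a right-hand side f(x, u) of the evolution equation, and an explicit Euler integrator. The function names (neural_galerkin_step, u_fn, f_fn) and the JAX-based formulation are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch of a Neural Galerkin step with a randomized sparse update.
# Assumptions: flat parameter vector, scalar network output, explicit Euler;
# this is an illustration, not the reference implementation of the paper.
import jax
import jax.numpy as jnp

def neural_galerkin_step(theta, key, x, u_fn, f_fn, dt, keep_frac=0.1):
    """Advance theta by one time step, updating only a random subset of entries.

    theta: (n_theta,) flattened network parameters
    x:     (n_x,) collocation points sampled in the spatial domain
    u_fn:  u_fn(theta, x_i) -> scalar network output at a single point
    f_fn:  f_fn(x_i, u_i)   -> scalar right-hand side of the evolution equation
    """
    # Jacobian of the network output with respect to the parameters.
    J = jax.vmap(jax.grad(u_fn), in_axes=(None, 0))(theta, x)    # (n_x, n_theta)
    u_vals = jax.vmap(u_fn, in_axes=(None, 0))(theta, x)         # (n_x,)
    rhs = jax.vmap(f_fn)(x, u_vals)                               # (n_x,)

    # Randomized sparse subset of parameters to update at this time step.
    n_theta = theta.shape[0]
    n_keep = max(1, int(keep_frac * n_theta))
    idx = jax.random.choice(key, n_theta, shape=(n_keep,), replace=False)

    # Galerkin least-squares condition restricted to the selected parameters.
    theta_dot_sub, *_ = jnp.linalg.lstsq(J[:, idx], rhs, rcond=None)

    # Explicit Euler update applied only to the selected entries;
    # all other parameters stay frozen during this step.
    return theta.at[idx].add(dt * theta_dot_sub)
```

Restricting the least-squares system to the sampled Jacobian columns is what lowers the per-step training cost, and drawing a fresh subset (a new key) at every step plays the dropout-like role described in the abstract.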
Related papers
- Gradient-free training of recurrent neural networks [3.272216546040443]
We introduce a computational approach to construct all weights and biases of a recurrent neural network without using gradient-based methods.
The approach is based on a combination of random feature networks and Koopman operator theory for dynamical systems.
In computational experiments on time series, forecasting for chaotic dynamical systems, and control problems, we observe that the training time and forecasting accuracy of the recurrent neural networks we construct are improved.
arXiv Detail & Related papers (2024-10-30T21:24:34Z)
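For the gradient-free construction summarized above, the sketch below shows a generic random-feature recipe: recurrent and input weights are sampled once and never updated, and only a linear readout is fitted in closed form. The sampling distributions and the Koopman-operator-based construction of the paper differ; all names here are hypothetical.

```python
# Generic random-feature sketch of gradient-free recurrent-network training:
# sample the recurrent weights, then fit only a linear readout by least squares.
# Illustrative only; the paper's Koopman-based construction is more involved.
import numpy as np

rng = np.random.default_rng(0)

def fit_random_feature_rnn(inputs, targets, n_hidden=256, reg=1e-6):
    """inputs: (T, d_in), targets: (T, d_out); returns (W_in, W_rec, W_out)."""
    d_in = inputs.shape[1]
    # Weights are drawn once from simple distributions (no gradient updates).
    W_in = rng.normal(scale=1.0 / np.sqrt(d_in), size=(n_hidden, d_in))
    W_rec = rng.normal(scale=1.0 / np.sqrt(n_hidden), size=(n_hidden, n_hidden))

    # Roll the fixed random recurrence forward to collect hidden states.
    h = np.zeros(n_hidden)
    H = np.empty((inputs.shape[0], n_hidden))
    for t, x_t in enumerate(inputs):
        h = np.tanh(W_in @ x_t + W_rec @ h)
        H[t] = h

    # Linear readout from hidden states via ridge-regularized least squares.
    W_out = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ targets).T
    return W_in, W_rec, W_out
```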
- Temporal Convolution Derived Multi-Layered Reservoir Computing [5.261277318790788]
We propose a new mapping of input data into the reservoir's state space.
We incorporate this method into two novel network architectures that increase the parallelizability, depth, and predictive capability of the neural network.
For the chaotic time series, we observe an error reduction of up to 85.45% compared to Echo State Networks and 90.72% compared to Gated Recurrent Units.
arXiv Detail & Related papers (2024-07-09T11:40:46Z)
- Time-Parameterized Convolutional Neural Networks for Irregularly Sampled Time Series [26.77596449192451]
Irregularly sampled time series are ubiquitous in several application domains, leading to sparse, not fully-observed and non-aligned observations.
Standard recurrent neural networks (RNNs) and convolutional neural networks (CNNs) assume regular spacing between observation times, which poses significant challenges for modeling irregularly sampled time series.
We parameterize convolutional layers by employing kernels that are explicit functions of time.
arXiv Detail & Related papers (2023-08-06T21:10:30Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Training Feedback Spiking Neural Networks by Implicit Differentiation on the Equilibrium State [66.2457134675891]
Spiking neural networks (SNNs) are brain-inspired models that enable energy-efficient implementation on neuromorphic hardware.
Most existing methods imitate the backpropagation framework and feedforward architectures of artificial neural networks.
We propose a novel training method that does not rely on the exact reverse of the forward computation.
arXiv Detail & Related papers (2021-09-29T07:46:54Z)
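The implicit-differentiation idea summarized above can be sketched generically in the style of deep equilibrium models: gradients are taken at the equilibrium state via the implicit function theorem rather than by reversing the forward computation. The SNN-specific formulation of the paper differs; the fixed-point solver and names below are assumptions.

```python
# Generic implicit differentiation at an equilibrium state z* = f(z*, theta);
# illustrative sketch, not the paper's SNN-specific training method.
import jax

def equilibrium_grad(f, z_star, theta, dL_dz, n_iter=50):
    """Gradient of a loss w.r.t. theta through a fixed point z* = f(z*, theta),
    without backpropagating through the forward iterations."""
    # Solve v = dL_dz + v * (df/dz at z*) by fixed-point iteration,
    # i.e. v = dL_dz (I - df/dz)^{-1} from the implicit function theorem.
    _, vjp_z = jax.vjp(lambda z: f(z, theta), z_star)
    v = dL_dz
    for _ in range(n_iter):
        v = dL_dz + vjp_z(v)[0]
    # Push the adjoint v through df/dtheta to obtain the parameter gradient.
    _, vjp_theta = jax.vjp(lambda th: f(z_star, th), theta)
    return vjp_theta(v)[0]
```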
- Oscillatory Fourier Neural Network: A Compact and Efficient Architecture for Sequential Processing [16.69710555668727]
We propose a novel neuron model that has a cosine activation with a time-varying component for sequential processing.
The proposed neuron provides an efficient building block for projecting sequential inputs into spectral domain.
Applied to sentiment analysis on the IMDB dataset, the proposed model reaches 89.4% test accuracy within 5 epochs.
arXiv Detail & Related papers (2021-09-14T19:08:07Z)
- What training reveals about neural network complexity [80.87515604428346]
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- LocalDrop: A Hybrid Regularization for Deep Neural Networks [98.30782118441158]
We propose LocalDrop, a new approach to the regularization of neural networks based on the local Rademacher complexity.
Based on the proposed upper bound of the local Rademacher complexity, a new regularization function is developed for both fully-connected networks (FCNs) and convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-03-01T03:10:11Z)
- Convergence rates for gradient descent in the training of overparameterized artificial neural networks with biases [3.198144010381572]
In recent years, artificial neural networks have developed into a powerful tool for dealing with a multitude of problems for which classical solution approaches reach their limits.
It is still unclear why randomly initialized gradient descent algorithms perform as well as they do in this setting.
arXiv Detail & Related papers (2021-02-23T18:17:47Z)
- Predicting Training Time Without Training [120.92623395389255]
We tackle the problem of predicting the number of optimization steps that a pre-trained deep network needs to converge to a given value of the loss function.
We leverage the fact that the training dynamics of a deep network during fine-tuning are well approximated by those of a linearized model.
We are able to predict the time it takes to fine-tune a model to a given loss without having to perform any training.
arXiv Detail & Related papers (2020-08-28T04:29:54Z)
- Liquid Time-constant Networks [117.57116214802504]
We introduce a new class of time-continuous recurrent neural network models.
Instead of declaring a learning system's dynamics by implicit nonlinearities, we construct networks of linear first-order dynamical systems.
These neural networks exhibit stable and bounded behavior and yield superior expressivity within the family of neural ordinary differential equations.
arXiv Detail & Related papers (2020-06-08T09:53:35Z)
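As a rough illustration of the construction summarized above, the cell below is a linear first-order system whose effective time constant is modulated by a nonlinear gate. The sigmoid gate, the explicit Euler solver, and the parameter layout are assumptions for illustration, not the paper's exact formulation.

```python
# Sketch of a liquid-time-constant style cell: linear first-order dynamics
# with a state- and input-dependent gate; illustrative assumptions only.
import numpy as np

def ltc_cell_step(x, inp, params, dt=0.1, tau=1.0):
    """One explicit Euler step of dx/dt = -x / tau + gate(x, inp) * (A - x)."""
    W_x, W_i, b, A = params["W_x"], params["W_i"], params["b"], params["A"]
    gate = 1.0 / (1.0 + np.exp(-(W_x @ x + W_i @ inp + b)))  # sigmoid gate
    dxdt = -x / tau + gate * (A - x)
    return x + dt * dxdt
```

The gate shortens or lengthens the effective time constant depending on the current state and input, which is one way to obtain the bounded trajectories the summary mentions.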