Inverse-Dirichlet Weighting Enables Reliable Training of Physics
Informed Neural Networks
- URL: http://arxiv.org/abs/2107.00940v1
- Date: Fri, 2 Jul 2021 10:01:37 GMT
- Title: Inverse-Dirichlet Weighting Enables Reliable Training of Physics
Informed Neural Networks
- Authors: Suryanarayana Maddu, Dominik Sturm, Christian L. Müller, Ivo F.
Sbalzarini
- Abstract summary: We describe and remedy a failure mode that may arise from multi-scale dynamics with scale imbalances during training of deep neural networks.
PINNs are popular machine-learning templates that allow for seamless integration of physical equation models with data.
For inverse modeling using sequential training, we find that inverse-Dirichlet weighting protects a PINN against catastrophic forgetting.
- Score: 2.580765958706854
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: We characterize and remedy a failure mode that may arise from multi-scale
dynamics with scale imbalances during training of deep neural networks, such as
Physics Informed Neural Networks (PINNs). PINNs are popular machine-learning
templates that allow for seamless integration of physical equation models with
data. Their training amounts to solving an optimization problem over a weighted
sum of data-fidelity and equation-fidelity objectives. Conflicts between
objectives can arise from scale imbalances, heteroscedasticity in the data,
stiffness of the physical equation, or from catastrophic interference during
sequential training. We explain the training pathology arising from this and
propose a simple yet effective inverse-Dirichlet weighting strategy to
alleviate the issue. We compare with Sobolev training of neural networks,
providing the baseline of analytically $\boldsymbol{\epsilon}$-optimal
training. We demonstrate the effectiveness of inverse-Dirichlet weighting in
various applications, including a multi-scale model of active turbulence, where
we show orders of magnitude improvement in accuracy and convergence over
conventional PINN training. For inverse modeling using sequential training, we
find that inverse-Dirichlet weighting protects a PINN against catastrophic
forgetting.
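As the abstract describes, PINN training minimizes a weighted sum of a data-fidelity and an equation-fidelity objective, and the proposed remedy rebalances those weights using gradient statistics. The sketch below is a minimal toy illustration of that idea, assuming the weights are rescaled so that the spread (here, the standard deviation) of each term's back-propagated parameter gradients becomes comparable, smoothed with an exponential moving average. The network architecture, the advection-type residual, the data set, and all hyperparameters are illustrative assumptions, not the authors' exact formulation or update schedule.

```python
# Toy sketch of gradient-statistics-based loss weighting for a PINN.
# Assumed setup: weights are rescaled so every loss term's parameter-gradient
# spread matches the largest one; this is an illustration, not the paper's
# exact inverse-Dirichlet recipe.
import math
import torch

torch.manual_seed(0)
net = torch.nn.Sequential(
    torch.nn.Linear(2, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 64), torch.nn.Tanh(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
params = [p for p in net.parameters() if p.requires_grad]

def grad_std(loss):
    """Standard deviation of the back-propagated gradient of one loss term."""
    grads = torch.autograd.grad(loss, params, retain_graph=True)
    return torch.cat([g.reshape(-1) for g in grads]).std()

def data_loss(x_d, u_d):
    # Data-fidelity objective: plain mean squared error on observations.
    return ((net(x_d) - u_d) ** 2).mean()

def pde_loss(x_r):
    # Equation-fidelity objective for a toy advection residual u_t + u_x = 0
    # (purely illustrative choice of PDE).
    x_r = x_r.requires_grad_(True)
    u = net(x_r)
    du = torch.autograd.grad(u, x_r, torch.ones_like(u), create_graph=True)[0]
    return ((du[:, 0] + du[:, 1]) ** 2).mean()

weights = [1.0, 1.0]   # one weight per objective
alpha = 0.5            # moving-average rate (assumed value)

for step in range(1000):
    x_d = torch.rand(128, 2)
    u_d = torch.sin(math.pi * (x_d[:, :1] - x_d[:, 1:]))  # synthetic data
    x_r = torch.rand(256, 2)
    losses = [data_loss(x_d, u_d), pde_loss(x_r)]

    if step % 10 == 0:  # periodically re-estimate the balancing weights
        stds = torch.stack([grad_std(l) for l in losses]) + 1e-12
        target = stds.max()
        weights = [alpha * w + (1 - alpha) * (target / s).item()
                   for w, s in zip(weights, stds)]

    total = sum(w * l for w, l in zip(weights, losses))
    opt.zero_grad()
    total.backward()
    opt.step()
```

The toy loop only demonstrates the weighting mechanics; the paper itself benchmarks the strategy against Sobolev training as an analytically $\boldsymbol{\epsilon}$-optimal baseline and applies it to multi-scale problems such as active turbulence.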
Related papers
- Neuromimetic metaplasticity for adaptive continual learning [2.1749194587826026]
We propose a metaplasticity model inspired by human working memory to achieve catastrophic forgetting-free continual learning.
A key aspect of our approach is the implementation of distinct types of synapses, ranging from stable to flexible, which are randomly intermixed to train synaptic connections with different degrees of flexibility.
The model achieved a balanced tradeoff between memory capacity and performance without requiring additional training or structural modifications.
arXiv Detail & Related papers (2024-07-09T12:21:35Z)
- Multi-fidelity physics constrained neural networks for dynamical systems [16.6396704642848]
We propose the Multi-Scale Physics-Constrained Neural Network (MSPCNN).
MSPCNN offers a novel methodology for incorporating data with different levels of fidelity into a unified latent space.
Unlike conventional methods, MSPCNN can also employ multi-fidelity data to train the predictive model.
arXiv Detail & Related papers (2024-02-03T05:05:26Z)
- Analyzing and Improving the Training Dynamics of Diffusion Models [36.37845647984578]
We identify and rectify several causes for uneven and ineffective training in the popular ADM diffusion model architecture.
We find that systematic application of this philosophy eliminates the observed drifts and imbalances, resulting in considerably better networks at equal computational complexity.
arXiv Detail & Related papers (2023-12-05T11:55:47Z)
- Towards a Better Theoretical Understanding of Independent Subnetwork Training [56.24689348875711]
We take a closer theoretical look at Independent Subnetwork Training (IST), a recently proposed and highly effective technique for distributed training.
We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication.
arXiv Detail & Related papers (2023-06-28T18:14:22Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
PINNs are prone to training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- SPIDE: A Purely Spike-based Method for Training Feedback Spiking Neural Networks [56.35403810762512]
Spiking neural networks (SNNs) with event-based computation are promising brain-inspired models for energy-efficient applications on neuromorphic hardware.
We study spike-based implicit differentiation on the equilibrium state (SPIDE), which extends a recently proposed training method.
arXiv Detail & Related papers (2023-02-01T04:22:59Z)
- Dual adaptive training of photonic neural networks [30.86507809437016]
Photonic neural networks (PNNs) compute with photons instead of electrons, offering low latency, high energy efficiency, and high parallelism.
Existing training approaches cannot address the extensive accumulation of systematic errors in large-scale PNNs.
We propose dual adaptive training (DAT), which allows the PNN model to adapt to substantial systematic errors.
arXiv Detail & Related papers (2022-12-09T05:03:45Z)
- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders-of-magnitude improvements in energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Neural Galerkin Schemes with Active Learning for High-Dimensional Evolution Equations [44.89798007370551]
This work proposes Neural Galerkin schemes based on deep learning that generate training data with active learning for numerically solving high-dimensional partial differential equations.
Neural Galerkin schemes build on the Dirac-Frenkel variational principle to train networks by minimizing the residual sequentially over time.
We find that the active gathering of training data in the proposed Neural Galerkin schemes is key to numerically realizing the expressive power of networks in high dimensions.
arXiv Detail & Related papers (2022-03-02T19:09:52Z)
- Neural networks with late-phase weights [66.72777753269658]
We show that the solutions found by SGD can be further improved by ensembling a subset of the weights in late stages of learning.
At the end of learning, we recover a single model by taking a spatial average in weight space.
arXiv Detail & Related papers (2020-07-25T13:23:37Z)
- Understanding the Effects of Data Parallelism and Sparsity on Neural Network Training [126.49572353148262]
We study two factors in neural network training: data parallelism and sparsity.
Despite their promising benefits, understanding of their effects on neural network training remains elusive.
arXiv Detail & Related papers (2020-03-25T10:49:22Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information provided (including any of its content) and is not responsible for any consequences of its use.