Stacked networks improve physics-informed training: applications to
neural networks and deep operator networks
- URL: http://arxiv.org/abs/2311.06483v2
- Date: Tue, 21 Nov 2023 04:53:27 GMT
- Title: Stacked networks improve physics-informed training: applications to
neural networks and deep operator networks
- Authors: Amanda A Howard, Sarah H Murphy, Shady E Ahmed, Panos Stinis
- Abstract summary: We present a novel multifidelity framework for stacking physics-informed neural networks and operator networks.
We show how stacking can be used to improve the accuracy and reduce the required size of physics-informed neural networks and operator networks.
- Score: 0.9999629695552196
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Physics-informed neural networks and operator networks have shown promise for
effectively solving equations modeling physical systems. However, these
networks can be difficult or impossible to train accurately for some systems of
equations. We present a novel multifidelity framework for stacking
physics-informed neural networks and operator networks that facilitates
training. We successively build a chain of networks, where the output at one
step can act as a low-fidelity input for training the next step, gradually
increasing the expressivity of the learned model. The equations imposed at each
step of the iterative process can be the same or different (akin to simulated
annealing). The iterative (stacking) nature of the proposed method allows us to
progressively learn features of a solution that are hard to learn directly.
Through benchmark problems including a nonlinear pendulum, the wave equation,
and the viscous Burgers equation, we show how stacking can be used to improve
the accuracy and reduce the required size of physics-informed neural networks
and operator networks.
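The chaining described in the abstract can be illustrated with a toy sketch. This is not the authors' architecture: polynomials stand in for neural networks, an additive correction stands in for the multifidelity network, and the test problem u' + u = 0 with u(0) = 1 is chosen only for simplicity. The names `train_stage`, `ic_weight`, and the degree schedule `(1, 2, 4)` are hypothetical. Each stage receives the previous stage's output as a low-fidelity baseline and fits a more expressive correction by minimizing the equation residual at collocation points.

```python
import math

def poly(c, t):
    """Evaluate the polynomial with coefficients c (lowest degree first) at t."""
    return sum(cj * t**j for j, cj in enumerate(c))

def dpoly(c, t):
    """Evaluate the derivative of that polynomial at t."""
    return sum(j * cj * t**(j - 1) for j, cj in enumerate(c) if j > 0)

def solve(A, b):
    """Solve the square system A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def train_stage(prev, degree, ts, ic_weight=10.0):
    """Fit a degree-`degree` correction on top of the previous stage `prev`.

    The loss is the squared residual of u' + u = 0 at the points `ts`
    plus a weighted penalty on the initial condition u(0) = 1.
    """
    n = degree + 1
    rows, rhs = [], []
    for t in ts:
        # Residual contribution of each correction monomial t**j: d/dt(t**j) + t**j
        rows.append([(j * t**(j - 1) if j > 0 else 0.0) + t**j for j in range(n)])
        # Residual already left over by the previous (low-fidelity) stage
        rhs.append(-(dpoly(prev, t) + poly(prev, t)))
    # Initial-condition row: only the constant term is nonzero at t = 0
    rows.append([ic_weight if j == 0 else 0.0 for j in range(n)])
    rhs.append(ic_weight * (1.0 - poly(prev, 0.0)))
    # Normal equations of the linear least-squares problem
    A = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    b = [sum(r[i] * y for r, y in zip(rows, rhs)) for i in range(n)]
    corr = solve(A, b)
    m = max(len(prev), n)
    prev_p = prev + [0.0] * (m - len(prev))
    corr_p = corr + [0.0] * (m - n)
    return [p + c for p, c in zip(prev_p, corr_p)]

ts = [i / 20 for i in range(21)]   # collocation points on [0, 1]
u = [0.0]                          # stage 0: trivial low-fidelity guess
errors = []
for degree in (1, 2, 4):           # expressivity grows with each stacked stage
    u = train_stage(u, degree, ts)
    errors.append(max(abs(poly(u, t) - math.exp(-t)) for t in ts))
print(errors)
```

Because this toy problem is linear, each stage ends up at the same fit that a direct fit of the same degree would reach, so the sketch shows only the chaining mechanism. It does not reproduce the training benefit the abstract reports for problems that are hard to learn directly.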
Related papers
- A simple theory for training response of deep neural networks [0.0]
Deep neural networks give us a powerful method to model the training dataset's relationship between input and output.
We show that the training response consists of several distinct factors that depend on training stage, activation function, and training method.
In addition, we show feature space reduction as an effect of training dynamics, which can result in network fragility.
arXiv Detail & Related papers (2024-05-07T07:20:15Z)
- Message Passing Variational Autoregressive Network for Solving Intractable Ising Models [6.261096199903392]
Many deep neural networks have been used to solve Ising models, including autoregressive neural networks, convolutional neural networks, recurrent neural networks, and graph neural networks.
Here we propose a variational autoregressive architecture with a message passing mechanism, which can effectively utilize the interactions between spin variables.
The new network trained under an annealing framework outperforms existing methods in solving several prototypical Ising spin Hamiltonians, especially for larger spin systems at low temperatures.
arXiv Detail & Related papers (2024-04-09T11:27:07Z)
- Graph Neural Networks for Learning Equivariant Representations of Neural Networks [55.04145324152541]
We propose to represent neural networks as computational graphs of parameters.
Our approach enables a single model to encode neural computational graphs with diverse architectures.
We showcase the effectiveness of our method on a wide range of tasks, including classification and editing of implicit neural representations.
arXiv Detail & Related papers (2024-03-18T18:01:01Z)
- Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences.
It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations.
Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
- Simple initialization and parametrization of sinusoidal networks via their kernel bandwidth [92.25666446274188]
Neural networks with sinusoidal activations have been proposed as an alternative to networks with traditional activation functions.
We first propose a simplified version of such sinusoidal neural networks, which allows both for easier practical implementation and simpler theoretical analysis.
We then analyze the behavior of these networks from the neural tangent kernel perspective and demonstrate that their kernel approximates a low-pass filter with an adjustable bandwidth.
arXiv Detail & Related papers (2022-11-26T07:41:48Z)
- Vanilla Feedforward Neural Networks as a Discretization of Dynamical Systems [9.382423715831687]
In this paper, we return to the classical network structure and prove that vanilla feedforward networks can also be viewed as a numerical discretization of dynamical systems.
Our results could provide a new perspective for understanding the approximation properties of feedforward neural networks.
arXiv Detail & Related papers (2022-09-22T10:32:08Z)
- Improving the Trainability of Deep Neural Networks through Layerwise Batch-Entropy Regularization [1.3999481573773072]
We introduce and evaluate the batch-entropy which quantifies the flow of information through each layer of a neural network.
We show that we can train a "vanilla" fully connected network and convolutional neural network with 500 layers by simply adding the batch-entropy regularization term to the loss function.
arXiv Detail & Related papers (2022-08-01T20:31:58Z)
- Learning Neural Network Subspaces [74.44457651546728]
Recent observations have advanced our understanding of the neural network optimization landscape.
With a similar computational cost as training one model, we learn lines, curves, and simplexes of high-accuracy neural networks.
arXiv Detail & Related papers (2021-02-20T23:26:58Z)
- Reservoir Memory Machines as Neural Computers [70.5993855765376]
Differentiable neural computers extend artificial neural networks with an explicit memory without interference.
We achieve some of the computational capabilities of differentiable neural computers with a model that can be trained very efficiently.
arXiv Detail & Related papers (2020-09-14T12:01:30Z)
- It's Hard for Neural Networks To Learn the Game of Life [4.061135251278187]
Recent findings suggest that neural networks rely on lucky random initial weights of "lottery tickets" that converge quickly to a solution.
We examine small convolutional networks that are trained to predict n steps of the two-dimensional cellular automaton Conway's Game of Life.
We find that networks of this architecture trained on this task rarely converge.
arXiv Detail & Related papers (2020-09-03T00:47:08Z)
- Training End-to-End Analog Neural Networks with Equilibrium Propagation [64.0476282000118]
We introduce a principled method to train end-to-end analog neural networks by gradient descent.
We show mathematically that a class of analog neural networks (called nonlinear resistive networks) are energy-based models.
Our work can guide the development of a new generation of ultra-fast, compact and low-power neural networks supporting on-chip learning.
arXiv Detail & Related papers (2020-06-02T23:38:35Z)