Reduced Order Modeling using Shallow ReLU Networks with Grassmann Layers
- URL: http://arxiv.org/abs/2012.09940v1
- Date: Thu, 17 Dec 2020 21:35:06 GMT
- Title: Reduced Order Modeling using Shallow ReLU Networks with Grassmann Layers
- Authors: Kayla Bollinger and Hayden Schaeffer
- Abstract summary: This paper presents a nonlinear model reduction method for systems of equations using a structured neural network.
We show that our method can be applied to scientific problems in the data-scarce regime, which is typically not well-suited for neural network approximations.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This paper presents a nonlinear model reduction method for systems of
equations using a structured neural network. The neural network takes the form
of a "three-layer" network with the first layer constrained to lie on the
Grassmann manifold and the first activation function set to identity, while the
remaining network is a standard two-layer ReLU neural network. The Grassmann
layer determines the reduced basis for the input space, while the remaining
layers approximate the nonlinear input-output system. The training alternates
between learning the reduced basis and the nonlinear approximation, and is
shown to be more effective than fixing the reduced basis and training the
network only. An additional benefit of this approach is, for data that lie on
low-dimensional subspaces, that the number of parameters in the network does
not need to be large. We show that our method can be applied to scientific
problems in the data-scarce regime, which is typically not well-suited for
neural network approximations. Examples include reduced order modeling for
nonlinear dynamical systems and several aerospace engineering problems.
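A minimal sketch of the architecture and alternating training scheme described in the abstract is given below. This is illustrative only, not the authors' implementation: the layer sizes, toy data, optimizers, and the QR-based re-orthonormalization used to keep the first layer on the Grassmann manifold are assumptions made for the example.

```python
import torch

n, k, width = 50, 3, 64  # full input dim, reduced dim, hidden width (illustrative)

# Grassmann layer: a matrix with orthonormal columns spanning the reduced input subspace.
A = torch.linalg.qr(torch.randn(n, k)).Q.detach().requires_grad_(True)

# Standard two-layer ReLU network acting on the reduced coordinates.
relu_net = torch.nn.Sequential(
    torch.nn.Linear(k, width),
    torch.nn.ReLU(),
    torch.nn.Linear(width, 1),
)

def model(x):
    # Identity activation on the Grassmann layer: project, then apply the ReLU network.
    return relu_net(x @ A)

# Toy stand-in for scarce scientific data whose relevant variation lies in a low-dimensional subspace.
x_train = torch.randn(200, n)
y_train = torch.sin(x_train[:, :1]) + 0.1 * x_train[:, 1:2] ** 2

loss_fn = torch.nn.MSELoss()
opt_net = torch.optim.Adam(relu_net.parameters(), lr=1e-3)
opt_basis = torch.optim.Adam([A], lr=1e-3)

for step in range(2000):
    # (1) Learn the nonlinear approximation with the reduced basis held fixed.
    opt_net.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt_net.step()

    # (2) Learn the reduced basis with the ReLU weights held fixed,
    #     then retract back onto the Grassmann manifold via a QR factorization.
    opt_basis.zero_grad()
    loss_fn(model(x_train), y_train).backward()
    opt_basis.step()
    with torch.no_grad():
        A.copy_(torch.linalg.qr(A).Q)
```

Re-orthonormalizing A after each gradient step is one simple way to stay on the manifold; the paper's actual Grassmann optimization procedure may differ.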
Related papers
- GradINN: Gradient Informed Neural Network [2.287415292857564]
We propose a methodology inspired by Physics Informed Neural Networks (PINNs).
GradINNs leverage prior beliefs about a system's gradient to constrain the predicted function's gradient across all input dimensions.
We demonstrate the advantages of GradINNs, particularly in low-data regimes, on diverse problems spanning non-time-dependent systems.
arXiv Detail & Related papers (2024-09-03T14:03:29Z) - Scalable Bayesian Inference in the Era of Deep Learning: From Gaussian Processes to Deep Neural Networks [0.5827521884806072]
Large neural networks trained on large datasets have become the dominant paradigm in machine learning.
This thesis develops scalable methods to equip neural networks with model uncertainty.
arXiv Detail & Related papers (2024-04-29T23:38:58Z) - Layer-wise Linear Mode Connectivity [52.6945036534469]
Averaging neural network parameters is an intuitive method for fusing the knowledge of two independent models.
It is most prominently used in federated learning.
We analyse the performance of the models that result from averaging single layers, or groups of layers.
arXiv Detail & Related papers (2023-07-13T09:39:10Z) - ReLU Neural Networks with Linear Layers are Biased Towards Single- and Multi-Index Models [9.96121040675476]
This manuscript explores how properties of functions learned by neural networks of depth greater than two layers affect predictions.
Our framework considers a family of networks of varying depths that all have the same capacity but different representation costs.
arXiv Detail & Related papers (2023-05-24T22:10:12Z) - Globally Optimal Training of Neural Networks with Threshold Activation
Functions [63.03759813952481]
We study weight decay regularized training problems of deep neural networks with threshold activations.
We derive a simplified convex optimization formulation when the dataset can be shattered at a certain layer of the network.
arXiv Detail & Related papers (2023-03-06T18:59:13Z) - Adversarial Examples Exist in Two-Layer ReLU Networks for Low
Dimensional Linear Subspaces [24.43191276129614]
We show that standard training methods lead to non-robust neural networks.
We show that decreasing the scale of the training algorithm, or adding $L_2$ regularization, can make the trained network more robust to adversarial perturbations.
arXiv Detail & Related papers (2023-03-01T19:10:05Z) - Implicit Bias in Leaky ReLU Networks Trained on High-Dimensional Data [63.34506218832164]
In this work, we investigate the implicit bias of gradient flow and gradient descent in two-layer fully-connected neural networks with leaky ReLU activations.
For gradient flow, we leverage recent work on the implicit bias for homogeneous neural networks to show that, asymptotically, gradient flow produces a neural network with rank at most two.
For gradient descent, provided the random initialization variance is small enough, we show that a single step of gradient descent suffices to drastically reduce the rank of the network, and that the rank remains small throughout training.
arXiv Detail & Related papers (2022-10-13T15:09:54Z) - Subquadratic Overparameterization for Shallow Neural Networks [60.721751363271146]
We provide an analytical framework that allows us to adopt standard neural training strategies.
We achieve the desiderata via Polyak-Lojasiewicz, smoothness, and standard assumptions.
arXiv Detail & Related papers (2021-11-02T20:24:01Z) - Non-Gradient Manifold Neural Network [79.44066256794187]
A deep neural network (DNN) generally takes thousands of iterations to optimize via gradient descent.
We propose a novel manifold neural network based on non-gradient optimization.
arXiv Detail & Related papers (2021-06-15T06:39:13Z) - Over-parametrized neural networks as under-determined linear systems [31.69089186688224]
We show that it is unsurprising that simple neural networks can achieve zero training loss.
We show that kernels typically associated with the ReLU activation function have fundamental flaws.
We propose new activation functions that avoid the pitfalls of ReLU in that they admit zero training loss solutions for any set of distinct data points.
arXiv Detail & Related papers (2020-10-29T21:43:00Z) - Modeling from Features: a Mean-field Framework for Over-parameterized
Deep Neural Networks [54.27962244835622]
This paper proposes a new mean-field framework for over-parameterized deep neural networks (DNNs).
In this framework, a DNN is represented by probability measures and functions over its features in the continuous limit.
We illustrate the framework via the standard DNN and the Residual Network (Res-Net) architectures.
arXiv Detail & Related papers (2020-07-03T01:37:16Z)