PirateNets: Physics-informed Deep Learning with Residual Adaptive
Networks
- URL: http://arxiv.org/abs/2402.00326v3
- Date: Sun, 11 Feb 2024 22:53:59 GMT
- Title: PirateNets: Physics-informed Deep Learning with Residual Adaptive
Networks
- Authors: Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris
- Abstract summary: We introduce Physics-informed Residual Adaptive Networks (PirateNets) to facilitate stable and efficient training of deep PINN models.
PirateNets leverage a novel adaptive residual connection, which allows the networks to be initialized as shallow networks that progressively deepen during training.
We show that PirateNets are easier to optimize and can gain accuracy from considerably increased depth, ultimately achieving state-of-the-art results across various benchmarks.
- Score: 19.519831541375144
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While physics-informed neural networks (PINNs) have become a popular deep
learning framework for tackling forward and inverse problems governed by
partial differential equations (PDEs), their performance is known to degrade
when larger and deeper neural network architectures are employed. Our study
identifies that the root of this counter-intuitive behavior lies in the use of
multi-layer perceptron (MLP) architectures with unsuitable initialization
schemes, which result in poor trainability of the network derivatives, and
ultimately lead to an unstable minimization of the PDE residual loss. To
address this, we introduce Physics-informed Residual Adaptive Networks
(PirateNets), a novel architecture that is designed to facilitate stable and
efficient training of deep PINN models. PirateNets leverage a novel adaptive
residual connection, which allows the networks to be initialized as shallow
networks that progressively deepen during training. We also show that the
proposed initialization scheme allows us to encode appropriate inductive biases
corresponding to a given PDE system into the network architecture. We provide
comprehensive empirical evidence showing that PirateNets are easier to optimize
and can gain accuracy from considerably increased depth, ultimately achieving
state-of-the-art results across various benchmarks. All code and data
accompanying this manuscript will be made publicly available at
\url{https://github.com/PredictiveIntelligenceLab/jaxpi}.
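For readers who want a concrete picture, the following is a minimal JAX sketch of the idea described in the abstract: each block carries a trainable gate, initialized to zero, that blends the block's output with its input, so the stacked network behaves as the identity (effectively shallow) at initialization and deepens as the gates grow during training. The block internals, gate placement, and initialization details in the authors' jaxpi implementation may differ.

```python
# A minimal sketch (not the authors' implementation) of an adaptive residual
# block in JAX: a trainable gate `alpha`, initialized to zero, blends each
# block's output with its input, so the network starts out effectively shallow
# (identity blocks) and can deepen as `alpha` grows during training.
import jax
import jax.numpy as jnp


def init_block(key, width):
    """One dense block with a zero-initialized residual gate."""
    w_key, _ = jax.random.split(key)
    return {
        "W": jax.random.normal(w_key, (width, width)) / jnp.sqrt(width),
        "b": jnp.zeros(width),
        "alpha": jnp.zeros(()),  # gate starts at 0 -> block acts as identity
    }


def block_apply(params, x):
    f = jnp.tanh(x @ params["W"] + params["b"])
    # Adaptive residual connection: alpha * f(x) + (1 - alpha) * x
    return params["alpha"] * f + (1.0 - params["alpha"]) * x


def net_apply(blocks, x):
    for p in blocks:
        x = block_apply(p, x)
    return x


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    blocks = [init_block(k, width=64) for k in jax.random.split(key, 8)]
    x = jax.random.normal(key, (16, 64))
    # At initialization every gate is zero, so the deep stack is the identity map.
    assert jnp.allclose(net_apply(blocks, x), x)
```

Because every gate starts at zero, the final assertion holds exactly: the untrained deep stack is the identity map, which is what lets optimization start from an effectively shallow network.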
Related papers
- GradINN: Gradient Informed Neural Network [2.287415292857564]
We propose Gradient Informed Neural Networks (GradINNs), a methodology inspired by Physics-Informed Neural Networks (PINNs).
GradINNs leverage prior beliefs about a system's gradient to constrain the predicted function's gradient across all input dimensions.
We demonstrate the advantages of GradINNs, particularly in low-data regimes, on diverse problems, including non-time-dependent systems.
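As a rough illustration of the gradient-constraint idea (not the paper's exact formulation), the JAX sketch below augments a standard regression loss with a penalty that pulls the model's input gradient toward a user-supplied prior; `prior_grad`, the weight `lam`, and the tiny MLP are hypothetical placeholders.

```python
# Illustrative sketch only: a data-fitting loss augmented with a penalty that
# pushes the network's input gradient toward a prior belief `prior_grad`
# (a hypothetical, user-supplied function).
import jax
import jax.numpy as jnp


def model(params, x):
    """Tiny scalar-output MLP, f: R^d -> R."""
    h = jnp.tanh(x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).squeeze()


def gradinn_loss(params, xs, ys, prior_grad, lam=0.1):
    # Standard supervised term.
    preds = jax.vmap(lambda x: model(params, x))(xs)
    data_term = jnp.mean((preds - ys) ** 2)
    # Gradient term: match df/dx to the prior gradient at each sample.
    dfdx = jax.vmap(jax.grad(lambda x: model(params, x)))(xs)
    grad_term = jnp.mean(jnp.sum((dfdx - jax.vmap(prior_grad)(xs)) ** 2, axis=-1))
    return data_term + lam * grad_term


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    k1, k2, k3 = jax.random.split(key, 3)
    params = {
        "W1": jax.random.normal(k1, (2, 32)) * 0.1, "b1": jnp.zeros(32),
        "W2": jax.random.normal(k2, (32, 1)) * 0.1, "b2": jnp.zeros(1),
    }
    xs = jax.random.normal(k3, (64, 2))
    ys = jnp.sin(xs[:, 0]) * jnp.cos(xs[:, 1])
    # Hypothetical prior: the analytic gradient of sin(x0)*cos(x1).
    prior = lambda x: jnp.array([jnp.cos(x[0]) * jnp.cos(x[1]),
                                 -jnp.sin(x[0]) * jnp.sin(x[1])])
    print("loss:", gradinn_loss(params, xs, ys, prior))
```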
arXiv Detail & Related papers (2024-09-03T14:03:29Z)
- NEPENTHE: Entropy-Based Pruning as a Neural Network Depth's Reducer [5.373015313199385]
We propose eNtropy-basEd Pruning as a nEural Network depTH's rEducer (NEPENTHE) to alleviate the computational burden of deep neural networks.
We validate our approach on popular architectures such as MobileNet and Swin-T.
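As a toy illustration of entropy-based layer scoring (not NEPENTHE's actual criterion), the sketch below estimates the entropy of each layer's on/off activation pattern and flags the lowest-entropy layer as a candidate for removal.

```python
# Toy sketch: estimate, for each layer, the Shannon entropy of its post-ReLU
# activation pattern (per-unit firing rate) and flag the lowest-entropy layer.
import jax
import jax.numpy as jnp


def activation_entropy(acts, eps=1e-8):
    """acts: (batch, units) pre-activations; entropy of the on/off pattern."""
    p_on = jnp.clip(jnp.mean(acts > 0, axis=0), eps, 1 - eps)
    h = -(p_on * jnp.log2(p_on) + (1 - p_on) * jnp.log2(1 - p_on))
    return jnp.mean(h)  # average bits per unit


def lowest_entropy_layer(per_layer_acts):
    scores = jnp.array([activation_entropy(a) for a in per_layer_acts])
    return int(jnp.argmin(scores)), scores


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    # The third layer's pre-activations are shifted negative -> low entropy.
    acts = [jax.random.normal(k, (256, 128)) - b
            for k, b in zip(jax.random.split(key, 4), (0.0, 0.0, 2.0, 0.0))]
    idx, scores = lowest_entropy_layer(acts)
    print("candidate layer:", idx, "entropies:", scores)
```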
arXiv Detail & Related papers (2024-04-24T09:12:04Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed by training the networks with better-suited, architecture-aware hyperparameters.
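The sketch below only illustrates the general pattern of making initialization scale and learning rate depend on width and depth; the constants and exponents are placeholders, not the prescriptions derived in the paper.

```python
# Illustrative only: fan-in-scaled initialization and a width/depth-damped
# learning rate. The specific scaling rules here are placeholders.
import jax
import jax.numpy as jnp


def scaled_init(key, fan_in, fan_out):
    # Variance ~ 1 / fan_in, as in standard fan-in initializations.
    return jax.random.normal(key, (fan_in, fan_out)) * jnp.sqrt(1.0 / fan_in)


def scaled_lr(base_lr, width, depth):
    # Placeholder: damp the maximal learning rate for wider and deeper nets.
    return base_lr / (width * jnp.sqrt(depth))
```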
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
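A hedged sketch of the "soft shrinkage" idea: rather than hard-zeroing the smallest-magnitude weights, they are shrunk by a small factor each iteration, keeping the sparse structure adjustable on-the-fly. Thresholds and schedules in the actual ISS-P method may differ.

```python
# Sketch of one soft-shrinkage pruning step: shrink (rather than zero) the
# smallest-magnitude fraction `p` of weights by a factor `shrink`.
import jax.numpy as jnp


def soft_shrink(weights, p=0.2, shrink=0.1):
    """Shrink the smallest-magnitude p fraction of weights by `shrink`."""
    flat = jnp.abs(weights).ravel()
    k = int(p * flat.size)
    if k == 0:
        return weights
    threshold = jnp.sort(flat)[k - 1]
    mask = jnp.abs(weights) <= threshold
    return jnp.where(mask, weights * shrink, weights)
```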
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning [3.012947865628207]
A key problem in network theory is how to reconfigure a graph in order to optimize a quantifiable objective.
In this paper, we cast the problem of network rewiring for optimizing a specified structural property as a Markov Decision Process (MDP).
We then propose a general approach based on the Deep Q-Network (DQN) algorithm and graph neural networks (GNNs) that can efficiently learn strategies for rewiring networks.
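The sketch below captures only the MDP side of this setup (no DQN or GNN): the state is an adjacency matrix, an action removes one edge and adds another, and the reward is the change in a structural objective. Degree-distribution entropy is used here purely as a stand-in objective.

```python
# Sketch of a rewiring MDP step: swap one edge for another and reward the
# change in degree-distribution entropy (an illustrative objective).
import jax.numpy as jnp


def degree_entropy(adj, eps=1e-12):
    deg = jnp.sum(adj, axis=1)
    p = jnp.clip(deg / jnp.clip(jnp.sum(deg), eps, None), eps, 1.0)
    return -jnp.sum(p * jnp.log(p))


def rewire_step(adj, remove_edge, add_edge):
    """Apply one rewiring action and return (next_state, reward)."""
    (i, j), (k, l) = remove_edge, add_edge
    before = degree_entropy(adj)
    adj = adj.at[i, j].set(0.0).at[j, i].set(0.0)
    adj = adj.at[k, l].set(1.0).at[l, k].set(1.0)
    return adj, degree_entropy(adj) - before


if __name__ == "__main__":
    adj = jnp.array([[0., 1., 1., 0.],
                     [1., 0., 0., 0.],
                     [1., 0., 0., 1.],
                     [0., 0., 1., 0.]])
    nxt, reward = rewire_step(adj, remove_edge=(0, 1), add_edge=(1, 3))
    print("entropy gain:", float(reward))
```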
arXiv Detail & Related papers (2022-05-26T18:44:22Z)
- Singular Value Perturbation and Deep Network Optimization [29.204852309828006]
We develop new theoretical results on matrix perturbation to shed light on the impact of architecture on the performance of a deep network.
In particular, we explain what deep learning practitioners have long observed empirically: the parameters of some deep architectures are easier to optimize than others.
A direct application of our perturbation results explains analytically why a ResNet is easier to optimize than a ConvNet.
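A quick numerical illustration of that intuition (not the paper's perturbation analysis): adding an identity skip to a modest random linear map lifts its smallest singular value well away from zero, i.e., the residual map is much better conditioned.

```python
# Compare the singular values of a random linear map with and without an
# identity skip connection.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
W = jax.random.normal(key, (64, 64)) * 0.05      # modest-scale random layer
s_plain = jnp.linalg.svd(W, compute_uv=False)
s_resid = jnp.linalg.svd(jnp.eye(64) + W, compute_uv=False)

print("plain:    min/max singular value", float(s_plain.min()), float(s_plain.max()))
print("residual: min/max singular value", float(s_resid.min()), float(s_resid.max()))
```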
arXiv Detail & Related papers (2022-03-07T02:09:39Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
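As a heavily simplified illustration of the analytic-inference idea, the snippet below propagates a Gaussian mean and variance through a single linear layer under independence assumptions; TAGI's actual forward and backward inference over weight posteriors is considerably more involved.

```python
# Toy moment propagation: mean/variance of y = W x + b for independent
# Gaussian W, x, and b. Not the full TAGI algorithm.
import jax.numpy as jnp


def linear_moments(mu_x, var_x, mu_w, var_w, bias_mu, bias_var):
    mu_y = mu_w @ mu_x + bias_mu
    # Var(w*x) for independent factors:
    # E[w]^2 Var(x) + Var(w) E[x]^2 + Var(w) Var(x)
    var_y = (mu_w**2) @ var_x + var_w @ (mu_x**2) + var_w @ var_x + bias_var
    return mu_y, var_y


if __name__ == "__main__":
    mu_x, var_x = jnp.array([1.0, -0.5]), jnp.array([0.1, 0.2])
    mu_w, var_w = jnp.array([[0.3, -0.7], [0.5, 0.2]]), jnp.full((2, 2), 0.05)
    print(linear_moments(mu_x, var_x, mu_w, var_w, jnp.zeros(2), jnp.zeros(2)))
```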
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- Kernel-Based Smoothness Analysis of Residual Networks [85.20737467304994]
Residual networks (ResNets) stand out among powerful modern architectures.
In this paper, we show another distinction between ResNets and standard fully connected networks, namely, a tendency of ResNets to promote smoother interpolations.
arXiv Detail & Related papers (2020-09-21T16:32:04Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
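The sketch below shows the generic deep-supervision pattern being described, assuming, for illustration, cross-entropy on each side branch plus a KL term that makes side branches mimic the main head; the paper's specific branch design and mimicking objective may differ.

```python
# Sketch: auxiliary side-branch heads trained on the labels and to mimic the
# main head's predictive distribution (a generic deep-supervision pattern).
import jax
import jax.numpy as jnp


def kl(p_logits, q_logits):
    p = jax.nn.log_softmax(p_logits)
    q = jax.nn.log_softmax(q_logits)
    return jnp.sum(jnp.exp(p) * (p - q), axis=-1).mean()


def total_loss(main_logits, side_logits_list, labels, mimic_weight=0.5):
    ce = lambda logits: -jnp.mean(
        jnp.take_along_axis(jax.nn.log_softmax(logits), labels[:, None], axis=-1)
    )
    loss = ce(main_logits)
    for side in side_logits_list:
        # Side branches: fit the labels and mimic the (detached) main head.
        loss += ce(side) + mimic_weight * kl(jax.lax.stop_gradient(main_logits), side)
    return loss
```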
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address the open problems in this area, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)