PirateNets: Physics-informed Deep Learning with Residual Adaptive
Networks
- URL: http://arxiv.org/abs/2402.00326v3
- Date: Sun, 11 Feb 2024 22:53:59 GMT
- Title: PirateNets: Physics-informed Deep Learning with Residual Adaptive
Networks
- Authors: Sifan Wang, Bowen Li, Yuhan Chen, Paris Perdikaris
- Abstract summary: We introduce Physics-informed Residual Adaptive Networks (PirateNets) to facilitate stable and efficient training of deep PINN models.
PirateNets leverage a novel adaptive residual connection, which allows the networks to be initialized as shallow networks that progressively deepen during training.
We show that PirateNets are easier to optimize and can gain accuracy from considerably increased depth, ultimately achieving state-of-the-art results across various benchmarks.
- Score: 19.519831541375144
- License: http://creativecommons.org/licenses/by-nc-nd/4.0/
- Abstract: While physics-informed neural networks (PINNs) have become a popular deep
learning framework for tackling forward and inverse problems governed by
partial differential equations (PDEs), their performance is known to degrade
when larger and deeper neural network architectures are employed. Our study
identifies that the root of this counter-intuitive behavior lies in the use of
multi-layer perceptron (MLP) architectures with unsuitable initialization
schemes, which result in poor trainability of the network derivatives, and
ultimately lead to an unstable minimization of the PDE residual loss. To
address this, we introduce Physics-informed Residual Adaptive Networks
(PirateNets), a novel architecture that is designed to facilitate stable and
efficient training of deep PINN models. PirateNets leverage a novel adaptive
residual connection, which allows the networks to be initialized as shallow
networks that progressively deepen during training. We also show that the
proposed initialization scheme allows us to encode appropriate inductive biases
corresponding to a given PDE system into the network architecture. We provide
comprehensive empirical evidence showing that PirateNets are easier to optimize
and can gain accuracy from considerably increased depth, ultimately achieving
state-of-the-art results across various benchmarks. All code and data
accompanying this manuscript will be made publicly available at
\url{https://github.com/PredictiveIntelligenceLab/jaxpi}.
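For readers who want a concrete picture, the following is a minimal JAX sketch of the idea described in the abstract: each block carries a trainable gate, initialized to zero, that blends the block's output with its input, so the stacked network behaves as the identity (effectively shallow) at initialization and deepens as the gates grow during training. The block internals, gate placement, and initialization details in the authors' jaxpi implementation may differ.

```python
# A minimal sketch (not the authors' implementation) of an adaptive residual
# block in JAX: a trainable gate `alpha`, initialized to zero, blends each
# block's output with its input, so the network starts out effectively shallow
# (identity blocks) and can deepen as `alpha` grows during training.
import jax
import jax.numpy as jnp


def init_block(key, width):
    """One dense block with a zero-initialized residual gate."""
    w_key, _ = jax.random.split(key)
    return {
        "W": jax.random.normal(w_key, (width, width)) / jnp.sqrt(width),
        "b": jnp.zeros(width),
        "alpha": jnp.zeros(()),  # gate starts at 0 -> block acts as identity
    }


def block_apply(params, x):
    f = jnp.tanh(x @ params["W"] + params["b"])
    # Adaptive residual connection: alpha * f(x) + (1 - alpha) * x
    return params["alpha"] * f + (1.0 - params["alpha"]) * x


def net_apply(blocks, x):
    for p in blocks:
        x = block_apply(p, x)
    return x


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    blocks = [init_block(k, width=64) for k in jax.random.split(key, 8)]
    x = jax.random.normal(key, (16, 64))
    # At initialization every gate is zero, so the deep stack is the identity map.
    assert jnp.allclose(net_apply(blocks, x), x)
```

Because every gate starts at zero, the final assertion holds exactly: the untrained deep stack is the identity map, which is what lets optimization start from an effectively shallow network.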
Related papers
- GradINN: Gradient Informed Neural Network [2.287415292857564]
We propose Gradient Informed Neural Networks (GradINNs), a methodology inspired by Physics-Informed Neural Networks (PINNs).
GradINNs leverage prior beliefs about a system's gradient to constrain the predicted function's gradient across all input dimensions.
We demonstrate the advantages of GradINNs, particularly in low-data regimes, on diverse problems, including non-time-dependent systems.
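As a rough illustration of the gradient-constraint idea (not the paper's exact formulation), the JAX sketch below augments a standard regression loss with a penalty that pulls the model's input gradient toward a user-supplied prior; `prior_grad`, the weight `lam`, and the tiny MLP are hypothetical placeholders.

```python
# Illustrative sketch only: a data-fitting loss augmented with a penalty that
# pushes the network's input gradient toward a prior belief `prior_grad`
# (a hypothetical, user-supplied function).
import jax
import jax.numpy as jnp


def model(params, x):
    """Tiny scalar-output MLP, f: R^d -> R."""
    h = jnp.tanh(x @ params["W1"] + params["b1"])
    return (h @ params["W2"] + params["b2"]).squeeze()


def gradinn_loss(params, xs, ys, prior_grad, lam=0.1):
    # Standard supervised term.
    preds = jax.vmap(lambda x: model(params, x))(xs)
    data_term = jnp.mean((preds - ys) ** 2)
    # Gradient term: match df/dx to the prior gradient at each sample.
    dfdx = jax.vmap(jax.grad(lambda x: model(params, x)))(xs)
    grad_term = jnp.mean(jnp.sum((dfdx - jax.vmap(prior_grad)(xs)) ** 2, axis=-1))
    return data_term + lam * grad_term


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    k1, k2, k3 = jax.random.split(key, 3)
    params = {
        "W1": jax.random.normal(k1, (2, 32)) * 0.1, "b1": jnp.zeros(32),
        "W2": jax.random.normal(k2, (32, 1)) * 0.1, "b2": jnp.zeros(1),
    }
    xs = jax.random.normal(k3, (64, 2))
    ys = jnp.sin(xs[:, 0]) * jnp.cos(xs[:, 1])
    # Hypothetical prior: the analytic gradient of sin(x0)*cos(x1).
    prior = lambda x: jnp.array([jnp.cos(x[0]) * jnp.cos(x[1]),
                                 -jnp.sin(x[0]) * jnp.sin(x[1])])
    print("loss:", gradinn_loss(params, xs, ys, prior))
```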
arXiv Detail & Related papers (2024-09-03T14:03:29Z)
- NEPENTHE: Entropy-Based Pruning as a Neural Network Depth's Reducer [5.373015313199385]
We propose eNtropy-basEd Pruning as a nEural Network depTH's rEducer (NEPENTHE) to alleviate the computational burden of deep neural networks.
We validate our approach on popular architectures such as MobileNet and Swin-T.
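As a toy illustration of entropy-based layer scoring (not NEPENTHE's actual criterion), the sketch below estimates the entropy of each layer's on/off activation pattern and flags the lowest-entropy layer as a candidate for removal.

```python
# Toy sketch: estimate, for each layer, the Shannon entropy of its post-ReLU
# activation pattern (per-unit firing rate) and flag the lowest-entropy layer.
import jax
import jax.numpy as jnp


def activation_entropy(acts, eps=1e-8):
    """acts: (batch, units) pre-activations; entropy of the on/off pattern."""
    p_on = jnp.clip(jnp.mean(acts > 0, axis=0), eps, 1 - eps)
    h = -(p_on * jnp.log2(p_on) + (1 - p_on) * jnp.log2(1 - p_on))
    return jnp.mean(h)  # average bits per unit


def lowest_entropy_layer(per_layer_acts):
    scores = jnp.array([activation_entropy(a) for a in per_layer_acts])
    return int(jnp.argmin(scores)), scores


if __name__ == "__main__":
    key = jax.random.PRNGKey(0)
    # The third layer's pre-activations are shifted negative -> low entropy.
    acts = [jax.random.normal(k, (256, 128)) - b
            for k, b in zip(jax.random.split(key, 4), (0.0, 0.0, 2.0, 0.0))]
    idx, scores = lowest_entropy_layer(acts)
    print("candidate layer:", idx, "entropies:", scores)
```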
arXiv Detail & Related papers (2024-04-24T09:12:04Z)
- Principled Architecture-aware Scaling of Hyperparameters [69.98414153320894]
Training a high-quality deep neural network requires choosing suitable hyperparameters, which is a non-trivial and expensive process.
In this work, we precisely characterize the dependence of initializations and maximal learning rates on the network architecture.
We demonstrate that network rankings in benchmarks can be easily changed by training the networks with better-suited, architecture-aware hyperparameters.
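The sketch below only illustrates the general pattern of making initialization scale and learning rate depend on width and depth; the constants and exponents are placeholders, not the prescriptions derived in the paper.

```python
# Illustrative only: fan-in-scaled initialization and a width/depth-damped
# learning rate. The specific scaling rules here are placeholders.
import jax
import jax.numpy as jnp


def scaled_init(key, fan_in, fan_out):
    # Variance ~ 1 / fan_in, as in standard fan-in initializations.
    return jax.random.normal(key, (fan_in, fan_out)) * jnp.sqrt(1.0 / fan_in)


def scaled_lr(base_lr, width, depth):
    # Placeholder: damp the maximal learning rate for wider and deeper nets.
    return base_lr / (width * jnp.sqrt(depth))
```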
arXiv Detail & Related papers (2024-02-27T11:52:49Z)
- Iterative Soft Shrinkage Learning for Efficient Image Super-Resolution [91.3781512926942]
Image super-resolution (SR) has witnessed extensive neural network designs from CNN to transformer architectures.
This work investigates the potential of network pruning for super-resolution to take advantage of off-the-shelf network designs and reduce the underlying computational overhead.
We propose a novel Iterative Soft Shrinkage-Percentage (ISS-P) method that optimizes the sparse structure of a randomly initialized network at each iteration and tweaks unimportant weights on-the-fly by a small amount proportional to the magnitude scale.
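A hedged sketch of the "soft shrinkage" idea: rather than hard-zeroing the smallest-magnitude weights, they are shrunk by a small factor each iteration, keeping the sparse structure adjustable on-the-fly. Thresholds and schedules in the actual ISS-P method may differ.

```python
# Sketch of one soft-shrinkage pruning step: shrink (rather than zero) the
# smallest-magnitude fraction `p` of weights by a factor `shrink`.
import jax.numpy as jnp


def soft_shrink(weights, p=0.2, shrink=0.1):
    """Shrink the smallest-magnitude p fraction of weights by `shrink`."""
    flat = jnp.abs(weights).ravel()
    k = int(p * flat.size)
    if k == 0:
        return weights
    threshold = jnp.sort(flat)[k - 1]
    mask = jnp.abs(weights) <= threshold
    return jnp.where(mask, weights * shrink, weights)
```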
arXiv Detail & Related papers (2023-03-16T21:06:13Z)
- Dynamic Network Reconfiguration for Entropy Maximization using Deep Reinforcement Learning [3.012947865628207]
A key problem in network theory is how to reconfigure a graph in order to optimize a quantifiable objective.
In this paper, we cast the problem of network rewiring for optimizing a specified structural property as a Markov Decision Process (MDP).
We then propose a general approach based on the Deep Q-Network (DQN) algorithm and graph neural networks (GNNs) that can efficiently learn strategies for rewiring networks.
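The sketch below captures only the MDP side of this setup (no DQN or GNN): the state is an adjacency matrix, an action removes one edge and adds another, and the reward is the change in a structural objective. Degree-distribution entropy is used here purely as a stand-in objective.

```python
# Sketch of a rewiring MDP step: swap one edge for another and reward the
# change in degree-distribution entropy (an illustrative objective).
import jax.numpy as jnp


def degree_entropy(adj, eps=1e-12):
    deg = jnp.sum(adj, axis=1)
    p = jnp.clip(deg / jnp.clip(jnp.sum(deg), eps, None), eps, 1.0)
    return -jnp.sum(p * jnp.log(p))


def rewire_step(adj, remove_edge, add_edge):
    """Apply one rewiring action and return (next_state, reward)."""
    (i, j), (k, l) = remove_edge, add_edge
    before = degree_entropy(adj)
    adj = adj.at[i, j].set(0.0).at[j, i].set(0.0)
    adj = adj.at[k, l].set(1.0).at[l, k].set(1.0)
    return adj, degree_entropy(adj) - before


if __name__ == "__main__":
    adj = jnp.array([[0., 1., 1., 0.],
                     [1., 0., 0., 0.],
                     [1., 0., 0., 1.],
                     [0., 0., 1., 0.]])
    nxt, reward = rewire_step(adj, remove_edge=(0, 1), add_edge=(1, 3))
    print("entropy gain:", float(reward))
```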
arXiv Detail & Related papers (2022-05-26T18:44:22Z)
- Singular Value Perturbation and Deep Network Optimization [29.204852309828006]
We develop new theoretical results on matrix perturbation to shed light on the impact of architecture on the performance of a deep network.
In particular, we explain what deep learning practitioners have long observed empirically: the parameters of some deep architectures are easier to optimize than others.
A direct application of our perturbation results explains analytically why a ResNet is easier to optimize than a ConvNet.
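A quick numerical illustration of that intuition (not the paper's perturbation analysis): adding an identity skip to a modest random linear map lifts its smallest singular value well away from zero, i.e., the residual map is much better conditioned.

```python
# Compare the singular values of a random linear map with and without an
# identity skip connection.
import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
W = jax.random.normal(key, (64, 64)) * 0.05      # modest-scale random layer
s_plain = jnp.linalg.svd(W, compute_uv=False)
s_resid = jnp.linalg.svd(jnp.eye(64) + W, compute_uv=False)

print("plain:    min/max singular value", float(s_plain.min()), float(s_plain.max()))
print("residual: min/max singular value", float(s_resid.min()), float(s_resid.max()))
```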
arXiv Detail & Related papers (2022-03-07T02:09:39Z)
- Analytically Tractable Inference in Deep Neural Networks [0.0]
The Tractable Approximate Gaussian Inference (TAGI) algorithm was shown to be a viable and scalable alternative to backpropagation for shallow fully-connected neural networks.
We demonstrate how TAGI matches or exceeds the performance of backpropagation for training classic deep neural network architectures.
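As a heavily simplified illustration of the analytic-inference idea, the snippet below propagates a Gaussian mean and variance through a single linear layer under independence assumptions; TAGI's actual forward and backward inference over weight posteriors is considerably more involved.

```python
# Toy moment propagation: mean/variance of y = W x + b for independent
# Gaussian W, x, and b. Not the full TAGI algorithm.
import jax.numpy as jnp


def linear_moments(mu_x, var_x, mu_w, var_w, bias_mu, bias_var):
    mu_y = mu_w @ mu_x + bias_mu
    # Var(w*x) for independent factors:
    # E[w]^2 Var(x) + Var(w) E[x]^2 + Var(w) Var(x)
    var_y = (mu_w**2) @ var_x + var_w @ (mu_x**2) + var_w @ var_x + bias_var
    return mu_y, var_y


if __name__ == "__main__":
    mu_x, var_x = jnp.array([1.0, -0.5]), jnp.array([0.1, 0.2])
    mu_w, var_w = jnp.array([[0.3, -0.7], [0.5, 0.2]]), jnp.full((2, 2), 0.05)
    print(linear_moments(mu_x, var_x, mu_w, var_w, jnp.zeros(2), jnp.zeros(2)))
```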
arXiv Detail & Related papers (2021-03-09T14:51:34Z)
- Kernel-Based Smoothness Analysis of Residual Networks [85.20737467304994]
Residual networks (ResNets) stand out among powerful modern architectures.
In this paper, we show another distinction between ResNets and standard fully connected networks, namely, a tendency of ResNets to promote smoother interpolations.
arXiv Detail & Related papers (2020-09-21T16:32:04Z)
- Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives [73.15276998621582]
We propose a generic feature learning mechanism to advance CNN training with enhanced generalization ability.
Partially inspired by DSN, we fork delicately designed side branches from the intermediate layers of a given neural network.
Experiments on both category and instance recognition tasks demonstrate the substantial improvements of our proposed method.
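The sketch below shows the generic deep-supervision pattern being described, assuming, for illustration, cross-entropy on each side branch plus a KL term that makes side branches mimic the main head; the paper's specific branch design and mimicking objective may differ.

```python
# Sketch: auxiliary side-branch heads trained on the labels and to mimic the
# main head's predictive distribution (a generic deep-supervision pattern).
import jax
import jax.numpy as jnp


def kl(p_logits, q_logits):
    p = jax.nn.log_softmax(p_logits)
    q = jax.nn.log_softmax(q_logits)
    return jnp.sum(jnp.exp(p) * (p - q), axis=-1).mean()


def total_loss(main_logits, side_logits_list, labels, mimic_weight=0.5):
    ce = lambda logits: -jnp.mean(
        jnp.take_along_axis(jax.nn.log_softmax(logits), labels[:, None], axis=-1)
    )
    loss = ce(main_logits)
    for side in side_logits_list:
        # Side branches: fit the labels and mimic the (detached) main head.
        loss += ce(side) + mimic_weight * kl(jax.lax.stop_gradient(main_logits), side)
    return loss
```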
arXiv Detail & Related papers (2020-03-24T09:56:13Z)
- Deep Learning for Ultra-Reliable and Low-Latency Communications in 6G Networks [84.2155885234293]
We first summarize how to apply data-driven supervised deep learning and deep reinforcement learning in URLLC.
To address the open problems in this area, we develop a multi-level architecture that enables device intelligence, edge intelligence, and cloud intelligence for URLLC.
arXiv Detail & Related papers (2020-02-22T14:38:11Z)
- Large-Scale Gradient-Free Deep Learning with Recursive Local Representation Alignment [84.57874289554839]
Training deep neural networks on large-scale datasets requires significant hardware resources.
Backpropagation, the workhorse for training these networks, is an inherently sequential process that is difficult to parallelize.
We propose a neuro-biologically-plausible alternative to backprop that can be used to train deep networks.
arXiv Detail & Related papers (2020-02-10T16:20:02Z)