Related papers: Optimizing the Optimizer for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks

Optimizing the Optimizer for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks

URL: http://arxiv.org/abs/2501.16371v5
Date: Sun, 24 Aug 2025 04:23:02 GMT
Title: Optimizing the Optimizer for Physics-Informed Neural Networks and Kolmogorov-Arnold Networks
Authors: Elham Kiyani, Khemraj Shukla, Jorge F. Urbán, Jérôme Darbon, George Em Karniadakis,
Abstract summary: Physics-Informed Neural Networks (PINNs) have revolutionized the computation PDE solutions by integrating partialmagnitude equations (PDEs) into the neural network's training process as soft constraints.<n>More, physics-informed networks (PIKANs) have also been effective and comparable in accuracy.
Score: 3.758814046658822
License: http://creativecommons.org/licenses/by-nc-sa/4.0/
Abstract: Physics-Informed Neural Networks (PINNs) have revolutionized the computation of PDE solutions by integrating partial differential equations (PDEs) into the neural network's training process as soft constraints, becoming an important component of the scientific machine learning (SciML) ecosystem. More recently, physics-informed Kolmogorv-Arnold networks (PIKANs) have also shown to be effective and comparable in accuracy with PINNs. In their current implementation, both PINNs and PIKANs are mainly optimized using first-order methods like Adam, as well as quasi-Newton methods such as BFGS and its low-memory variant, L-BFGS. However, these optimizers often struggle with highly non-linear and non-convex loss landscapes, leading to challenges such as slow convergence, local minima entrapment, and (non)degenerate saddle points. In this study, we investigate the performance of Self-Scaled BFGS (SSBFGS), Self-Scaled Broyden (SSBroyden) methods and other advanced quasi-Newton schemes, including BFGS and L-BFGS with different line search strategies. These methods dynamically rescale updates based on historical gradient information, thus enhancing training efficiency and accuracy. We systematically compare these optimizers using both PINNs and PIKANs on key challenging PDEs, including the Burgers, Allen-Cahn, Kuramoto-Sivashinsky, Ginzburg-Landau, and Stokes equations. Additionally, we evaluate the performance of SSBFGS and SSBroyden for Deep Operator Network (DeepONet) architectures, demonstrating their effectiveness for data-driven operator learning. Our findings provide state-of-the-art results with orders-of-magnitude accuracy improvements without the use of adaptive weights or any other enhancements typically employed in PINNs.

Related papers

Do physics-informed neural networks (PINNs) need to be deep? Shallow PINNs using the Levenberg-Marquardt algorithm [0.0]
This work investigates the use of shallow physics-informed neural networks (PINNs) for solving forward and inverse problems of nonlinear partial differential equations (PDEs)<n>The proposed approach is tested on several benchmark problems, including the Burgers, Schrdinger, Allen-Cahn, and three-dimensional Bratu equations.
arXiv Detail & Related papers (2026-02-09T11:05:57Z)
Efficient Fault Detection in WSN Based on PCA-Optimized Deep Neural Network Slicing Trained with GOA [0.6827423171182154]
Traditional fault detection methods often struggle with optimizing deep neural networks (DNNs) for efficient performance.<n>This study proposes a novel hybrid method combining Principal Component Analysis (PCA) with a DNN optimized by the Grasshopper Optimization Algorithm (GOA) to address these limitations.<n>Our approach achieves a remarkable 99.72% classification accuracy, with exceptional precision and recall, outperforming conventional methods.
arXiv Detail & Related papers (2025-05-11T15:51:56Z)
DeepONet Augmented by Randomized Neural Networks for Efficient Operator Learning in PDEs [5.84093922354671]
We propose RaNN-DeepONets, a hybrid architecture designed to balance accuracy and efficiency.<n>RaNN-DeepONets achieves comparable accuracy while reducing computational costs by orders of magnitude.<n>These results highlight the potential of RaNN-DeepONets as an efficient alternative for operator learning in PDE-based systems.
arXiv Detail & Related papers (2025-03-01T03:05:29Z)
DiffGrad for Physics-Informed Neural Networks [0.0]
Burgers' equation, a fundamental equation in fluid dynamics that is extensively used in PINNs, provides flexible results with the Adamprop. This paper introduces a novel strategy for solving Burgers' equation by incorporating DiffGrad with PINNs.
arXiv Detail & Related papers (2024-09-05T04:39:35Z)
Physics Informed Kolmogorov-Arnold Neural Networks for Dynamical Analysis via Efficent-KAN and WAV-KAN [0.12045539806824918]
We implement the Physics-Informed Kolmogorov-Arnold Neural Networks (PIKAN) through efficient-KAN and WAV-KAN. PIKAN demonstrates superior performance compared to conventional deep neural networks, achieving the same level of accuracy with fewer layers and reduced computational overhead.
arXiv Detail & Related papers (2024-07-25T20:14:58Z)
Adaptive Training of Grid-Dependent Physics-Informed Kolmogorov-Arnold Networks [4.216184112447278]
Physics-Informed Neural Networks (PINNs) have emerged as a robust framework for solving Partial Differential Equations (PDEs) We present a fast JAX-based implementation of grid-dependent Physics-Informed Kolmogorov-Arnold Networks (PIKANs) for solving PDEs. We demonstrate that the adaptive features significantly enhance solution accuracy, decreasing the L2 error relative to the reference solution by up to 43.02%.
arXiv Detail & Related papers (2024-07-24T19:55:08Z)
Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [4.554284689395686]
implicit gradient descent (IGD) outperforms the common gradient descent (GD) algorithm in handling certain multi-scale problems.<n>We show that IGD converges to a globally optimal solution at a linear convergence rate.
arXiv Detail & Related papers (2024-07-03T06:10:41Z)
DFA-GNN: Forward Learning of Graph Neural Networks by Direct Feedback Alignment [57.62885438406724]
Graph neural networks are recognized for their strong performance across various applications. BP has limitations that challenge its biological plausibility and affect the efficiency, scalability and parallelism of training neural networks for graph-based tasks. We propose DFA-GNN, a novel forward learning framework tailored for GNNs with a case study of semi-supervised learning.
arXiv Detail & Related papers (2024-06-04T07:24:51Z)
RoPINN: Region Optimized Physics-Informed Neural Networks [66.38369833561039]
Physics-informed neural networks (PINNs) have been widely applied to solve partial differential equations (PDEs) This paper proposes and theoretically studies a new training paradigm as region optimization. A practical training algorithm, Region Optimized PINN (RoPINN), is seamlessly derived from this new paradigm.
arXiv Detail & Related papers (2024-05-23T09:45:57Z)
Densely Multiplied Physics Informed Neural Networks [1.8554335256160261]
physics-informed neural networks (PINNs) have shown great potential in dealing with nonlinear partial differential equations (PDEs) This paper improves the neural network architecture to improve the performance of PINN. We propose a densely multiply PINN (DM-PINN) architecture, which multiplies the output of a hidden layer with the outputs of all the behind hidden layers.
arXiv Detail & Related papers (2024-02-06T20:45:31Z)
Challenges in Training PINNs: A Loss Landscape Perspective [16.89714536706181]
This paper explores challenges in training Physics-Informed Neural Networks (PINNs) We examine difficulties in minimizing the PINN loss function, particularly due to ill-conditioning caused by differential operators in the residual term. We introduce a novel second-order gradient, NysNewton-CG (NNCG), which significantly improves PINN performance.
arXiv Detail & Related papers (2024-02-02T19:46:43Z)
A Gaussian Process Framework for Solving Forward and Inverse Problems Involving Nonlinear Partial Differential Equations [0.0]
We introduce kernel-weighted Corrective Residuals (CoRes) to integrate the strengths of kernel methods and deep NNs for solving nonlinear PDE systems. CoRes consistently outperforms competing methods in solving a broad range of benchmark problems. We believe our findings have the potential to spark a renewed interest in leveraging kernel methods for solving PDEs.
arXiv Detail & Related papers (2024-01-07T14:09:42Z)
Efficient and Flexible Neural Network Training through Layer-wise Feedback Propagation [49.44309457870649]
We present Layer-wise Feedback Propagation (LFP), a novel training principle for neural network-like predictors. LFP decomposes a reward to individual neurons based on their respective contributions to solving a given task. Our method then implements a greedy approach reinforcing helpful parts of the network and weakening harmful ones.
arXiv Detail & Related papers (2023-08-23T10:48:28Z)
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems. PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features. In this paper, we propose to employ implicit gradient descent (ISGD) method to train PINNs for improving the stability of training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
RBF-MGN:Solving spatiotemporal PDEs with Physics-informed Graph Neural Network [4.425915683879297]
We propose a novel framework based on graph neural networks (GNNs) and radial basis function finite difference (RBF-FD) RBF-FD is used to construct a high-precision difference format of the differential equations to guide model training. We illustrate the generalizability, accuracy, and efficiency of the proposed algorithms on different PDE parameters.
arXiv Detail & Related papers (2022-12-06T10:08:02Z)
Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency. We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
Recurrent Bilinear Optimization for Binary Neural Networks [58.972212365275595]
BNNs neglect the intrinsic bilinear relationship of real-valued weights and scale factors. Our work is the first attempt to optimize BNNs from the bilinear perspective. We obtain robust RBONNs, which show impressive performance over state-of-the-art BNNs on various models and datasets.
arXiv Detail & Related papers (2022-09-04T06:45:33Z)
Auto-PINN: Understanding and Optimizing Physics-Informed Neural Architecture [77.59766598165551]
Physics-informed neural networks (PINNs) are revolutionizing science and engineering practice by bringing together the power of deep learning to bear on scientific computation. Here, we propose Auto-PINN, which employs Neural Architecture Search (NAS) techniques to PINN design. A comprehensive set of pre-experiments using standard PDE benchmarks allows us to probe the structure-performance relationship in PINNs.
arXiv Detail & Related papers (2022-05-27T03:24:31Z)
Revisiting PINNs: Generative Adversarial Physics-informed Neural Networks and Point-weighting Method [70.19159220248805]
Physics-informed neural networks (PINNs) provide a deep learning framework for numerically solving partial differential equations (PDEs) We propose the generative adversarial neural network (GA-PINN), which integrates the generative adversarial (GA) mechanism with the structure of PINNs. Inspired from the weighting strategy of the Adaboost method, we then introduce a point-weighting (PW) method to improve the training efficiency of PINNs.
arXiv Detail & Related papers (2022-05-18T06:50:44Z)
Learning Physics-Informed Neural Networks without Stacked Back-propagation [82.26566759276105]
We develop a novel approach that can significantly accelerate the training of Physics-Informed Neural Networks. In particular, we parameterize the PDE solution by the Gaussian smoothed model and show that, derived from Stein's Identity, the second-order derivatives can be efficiently calculated without back-propagation. Experimental results show that our proposed method can achieve competitive error compared to standard PINN training but is two orders of magnitude faster.
arXiv Detail & Related papers (2022-02-18T18:07:54Z)
Communication-Efficient Distributed Stochastic AUC Maximization with Deep Neural Networks [50.42141893913188]
We study a distributed variable for large-scale AUC for a neural network as with a deep neural network. Our model requires a much less number of communication rounds and still a number of communication rounds in theory. Our experiments on several datasets show the effectiveness of our theory and also confirm our theory.
arXiv Detail & Related papers (2020-05-05T18:08:23Z)

This list is automatically generated from the titles and abstracts of the papers in this site.