Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra
- URL: http://arxiv.org/abs/2505.11638v2
- Date: Tue, 20 May 2025 08:32:03 GMT
- Title: Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra
- Authors: Ivan Bioli, Carlo Marcati, Giancarlo Sangalli
- Abstract summary: Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs). We extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Nyström preconditioning to accelerate convergence of the inner CG solver. The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods on a range of PDE problems discretized using neural networks.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs), such as Physics-Informed Neural Networks (PINNs). However, its practical use is often limited by the high computational cost of solving linear systems involving the Gramian matrix. While matrix-free NGD methods based on the conjugate gradient (CG) method avoid explicit matrix inversion, the ill-conditioning of the Gramian significantly slows the convergence of CG. In this work, we extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Nyström preconditioning to accelerate convergence of the inner CG solver. The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods on a range of PDE problems discretized using neural networks.
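To make the combination concrete, the sketch below pairs a randomized Nyström low-rank approximation of a symmetric positive semi-definite matrix G (a stand-in for the Gramian) with a Nyström-preconditioned conjugate gradient solve of the damped system (G + mu*I) x = b. The test matrix, rank, damping mu, and tolerances are illustrative assumptions, not the authors' implementation.

```python
# Minimal sketch (not the paper's code): randomized Nystrom approximation of an SPD
# matrix G, used to precondition CG on the damped system (G + mu*I) x = b.
import numpy as np
from scipy.linalg import solve_triangular

def nystrom_approx(G, rank, rng):
    """Randomized Nystrom approximation G ~ U @ diag(lam) @ U.T of a symmetric PSD matrix."""
    n = G.shape[0]
    Omega, _ = np.linalg.qr(rng.standard_normal((n, rank)))          # orthonormal test matrix
    Y = G @ Omega                                                     # sketch of G
    nu = np.sqrt(n) * np.finfo(Y.dtype).eps * np.linalg.norm(Y, 2)    # stabilizing shift
    Y_nu = Y + nu * Omega
    M = Omega.T @ Y_nu
    C = np.linalg.cholesky((M + M.T) / 2.0)
    B = solve_triangular(C, Y_nu.T, lower=True).T                     # B @ B.T ~ G + nu*I
    U, s, _ = np.linalg.svd(B, full_matrices=False)
    lam = np.maximum(s ** 2 - nu, 0.0)                                # remove the shift
    return U, lam

def nystrom_pinv(U, lam, mu):
    """Return v -> P^{-1} v for the Nystrom preconditioner of G + mu*I."""
    scale = (lam[-1] + mu) / (lam + mu)
    def apply(v):
        Utv = U.T @ v
        return U @ (scale * Utv) + (v - U @ Utv)
    return apply

def pcg(matvec, b, pinv, tol=1e-8, maxiter=500):
    """Preconditioned conjugate gradients for an SPD operator given by matvec."""
    x = np.zeros_like(b)
    r = b.copy()
    z = pinv(r)
    p = z.copy()
    rz = r @ z
    for it in range(1, maxiter + 1):
        Ap = matvec(p)
        alpha = rz / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        if np.linalg.norm(r) <= tol * np.linalg.norm(b):
            break
        z = pinv(r)
        rz_new = r @ z
        p = z + (rz_new / rz) * p
        rz = rz_new
    return x, it

rng = np.random.default_rng(0)
n, rank, mu = 500, 50, 1e-3
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
G = (Q * (1.0 / np.arange(1, n + 1) ** 2)) @ Q.T     # SPD stand-in with fast spectral decay
b = rng.standard_normal(n)

U, lam = nystrom_approx(G, rank, rng)
x, iters = pcg(lambda v: G @ v + mu * v, b, nystrom_pinv(U, lam, mu))
print("PCG iterations:", iters)
```

Because both the Nyström sketch and the preconditioned CG loop only need matrix-vector products with G, the same structure applies when the Gramian is available only implicitly, which is the matrix-free setting the abstract describes.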
Related papers
- A Natural Primal-Dual Hybrid Gradient Method for Adversarial Neural Network Training on Solving Partial Differential Equations [9.588717577573684]
We propose a scalable preconditioned primal-dual hybrid gradient algorithm for solving partial differential equations (PDEs). We compare the performance of the proposed method with several commonly used deep learning algorithms. The numerical results suggest that the proposed method performs efficiently and robustly and converges more stably.
arXiv Detail & Related papers (2024-11-09T20:39:10Z) - Convergence of Implicit Gradient Descent for Training Two-Layer Physics-Informed Neural Networks [3.680127959836384]
Implicit gradient descent (IGD) outperforms common gradient descent (GD) in handling certain multi-scale problems.
We show that IGD converges to a globally optimal solution at a linear convergence rate.
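As a minimal illustration of why the implicit step helps on stiff, multi-scale problems, the sketch below compares explicit and implicit (backward-Euler) gradient steps on a toy quadratic with two widely separated curvature scales; the quadratic, step size, and closed-form implicit solve are assumptions made for brevity (for a general loss the implicit equation would be solved iteratively).

```python
# Hedged sketch: explicit vs. implicit gradient steps on L(theta) = 0.5 * theta^T A theta.
import numpy as np

A = np.diag([1.0, 1.0e4])                 # two widely separated curvature scales
grad = lambda th: A @ th                  # gradient of the quadratic loss
eta = 1.0e-3                              # exceeds 2 / lambda_max = 2e-4, so explicit GD diverges
I = np.eye(2)

theta_exp = np.array([1.0, 1.0])
theta_imp = np.array([1.0, 1.0])
for _ in range(200):
    theta_exp = theta_exp - eta * grad(theta_exp)          # explicit (forward Euler) step
    theta_imp = np.linalg.solve(I + eta * A, theta_imp)    # implicit (backward Euler) step

print("explicit GD :", theta_exp)   # blows up along the stiff direction
print("implicit GD :", theta_imp)   # decays toward the minimizer [0, 0]
```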
arXiv Detail & Related papers (2024-07-03T06:10:41Z) - Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers [40.6591136324878]
We train GNNs to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners. Our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations.
arXiv Detail & Related papers (2024-05-24T13:44:30Z) - A Structure-Guided Gauss-Newton Method for Shallow ReLU Neural Network [18.06366638807982]
We propose a structure-guided Gauss-Newton (SgGN) method for solving least squares problems using a shallow ReLU neural network.
The method effectively takes advantage of both the least squares structure and the neural network structure of the objective function.
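For orientation, here is a generic damped Gauss-Newton (Levenberg-Marquardt style) iteration for fitting a shallow 1-D ReLU network by least squares; it shows the residual/Jacobian structure being exploited, but it is not the paper's structure-guided variant, and the target function, network width, and damping schedule are illustrative assumptions.

```python
# Hedged sketch: damped Gauss-Newton for u(x) = sum_i c_i * relu(w_i * x + b_i) fit to data.
import numpy as np

rng = np.random.default_rng(0)
m = 20                                               # number of neurons
x = np.linspace(-1.0, 1.0, 200)
y = np.abs(x)                                        # toy target function
c, w, b = rng.standard_normal(m), rng.standard_normal(m), rng.standard_normal(m)

def residual_and_jacobian(c, w, b):
    pre = np.outer(x, w) + b                         # (n, m) pre-activations
    act = np.maximum(pre, 0.0)                       # relu
    ind = (pre > 0.0).astype(float)                  # relu derivative
    r = act @ c - y                                  # residuals
    J = np.hstack([act, ind * c * x[:, None], ind * c])   # d r / d(c, w, b)
    return r, J

lam = 1e-2                                           # damping parameter
r, J = residual_and_jacobian(c, w, b)
for _ in range(100):
    delta = np.linalg.solve(J.T @ J + lam * np.eye(3 * m), -J.T @ r)
    c_n, w_n, b_n = c + delta[:m], w + delta[m:2 * m], b + delta[2 * m:]
    r_n, J_n = residual_and_jacobian(c_n, w_n, b_n)
    if np.linalg.norm(r_n) < np.linalg.norm(r):      # accept the step, relax damping
        c, w, b, r, J = c_n, w_n, b_n, r_n, J_n
        lam = max(lam / 2.0, 1e-8)
    else:                                            # reject the step, increase damping
        lam *= 10.0

print("final RMSE:", np.sqrt(np.mean(r ** 2)))
```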
arXiv Detail & Related papers (2024-04-07T20:24:44Z) - Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
However, PINNs can be trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z) - AskewSGD : An Annealed interval-constrained Optimisation method to train Quantized Neural Networks [12.229154524476405]
We develop a new algorithm, Annealed Skewed SGD - AskewSGD - for training deep neural networks (DNNs) with quantized weights.
Unlike algorithms with active sets and feasible directions, AskewSGD avoids projections or optimization under the entire feasible set.
Experimental results show that the AskewSGD algorithm performs better than or on par with state-of-the-art methods on classical benchmarks.
arXiv Detail & Related papers (2022-11-07T18:13:44Z) - Stability and Generalization Analysis of Gradient Methods for Shallow Neural Networks [59.142826407441106]
We study the generalization behavior of shallow neural networks (SNNs) by leveraging the concept of algorithmic stability.
We consider gradient descent (GD) and stochastic gradient descent (SGD) to train SNNs, and for both we develop consistent excess risk bounds.
arXiv Detail & Related papers (2022-09-19T18:48:00Z) - Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning framework based on the Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem.
CoGD is introduced to solve bilinear problems when one variable is subject to a sparsity constraint.
It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs).
arXiv Detail & Related papers (2021-06-20T04:28:20Z) - Hamiltonian Deep Neural Networks Guaranteeing Non-vanishing Gradients by Design [2.752441514346229]
Vanishing and exploding gradients during weight optimization through backpropagation can make deep networks difficult to train.
We propose a general class of Hamiltonian DNNs (H-DNNs) that stem from the discretization of continuous-time Hamiltonian systems.
Our main result is that a broad set of H-DNNs ensures non-vanishing gradients by design for an arbitrary network depth.
The good performance of H-DNNs is demonstrated on benchmark classification problems, including image classification with the MNIST dataset.
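As a rough sketch of the construction, the code below stacks forward-Euler steps of a Hamiltonian system y' = J * gradH(y) with a skew-symmetric J, so each layer is one discretization step of such a system; the particular Hamiltonian, step size, depth, and weight layout are illustrative assumptions rather than the paper's exact architectures.

```python
# Hedged sketch of a Hamiltonian-inspired DNN forward pass (illustrative assumptions).
import numpy as np

rng = np.random.default_rng(0)
n, depth, h = 4, 16, 0.1                          # state size, number of layers, step size

# Skew-symmetric interconnection J (J.T == -J) and per-layer weights/biases
J = np.triu(rng.standard_normal((n, n)), 1)
J = J - J.T
K = rng.standard_normal((depth, n, n))
b = rng.standard_normal((depth, n))

def grad_H(y, K_l, b_l):
    """Gradient of H(y) = sum(log cosh(K y + b)), i.e. K^T tanh(K y + b)."""
    return K_l.T @ np.tanh(K_l @ y + b_l)

def hdnn_forward(y):
    """Forward Euler discretization of y' = J * grad_H(y): one layer per time step."""
    for l in range(depth):
        y = y + h * J @ grad_H(y, K[l], b[l])
    return y

y0 = rng.standard_normal(n)
print(hdnn_forward(y0))
```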
arXiv Detail & Related papers (2021-05-27T14:52:22Z) - NTopo: Mesh-free Topology Optimization using Implicit Neural Representations [35.07884509198916]
We present a novel machine learning approach to tackle topology optimization problems.
We use multilayer perceptrons (MLPs) to parameterize both density and displacement fields.
As we show through our experiments, a major benefit of our approach is that it enables self-supervised learning of continuous solution spaces.
arXiv Detail & Related papers (2021-02-22T05:25:22Z) - Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data.
We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity.
Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Controllable Orthogonalization in Training DNNs [96.1365404059924]
Orthogonality is widely used for training deep neural networks (DNNs) due to its ability to maintain all singular values of the Jacobian close to 1.
This paper proposes a computationally efficient and numerically stable orthogonalization method using Newton's iteration (ONI).
We show that our method improves the performance of image classification networks by effectively controlling the orthogonality to provide an optimal tradeoff between optimization benefits and representational capacity reduction.
We also show that ONI stabilizes the training of generative adversarial networks (GANs) by maintaining the Lipschitz continuity of a network, similar to spectral normalization.
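A minimal sketch of orthogonalization by Newton's iteration in this spirit: a proxy matrix Z is mapped to W ≈ (Z Z^T)^{-1/2} Z via the Newton-Schulz recurrence, so the rows of W become approximately orthonormal; the normalization and iteration count are assumptions, and more iterations push W closer to exact orthogonality.

```python
# Hedged sketch of orthogonalization via Newton-Schulz iteration (not ONI's exact code).
import numpy as np

def newton_orthogonalize(Z, n_iter=15):
    V = Z / np.linalg.norm(Z)                     # Frobenius normalization keeps eig(S) in (0, 1]
    S = V @ V.T
    B = np.eye(S.shape[0])
    for _ in range(n_iter):
        B = 0.5 * (3.0 * B - B @ B @ B @ S)       # B converges to S^{-1/2}
    return B @ V                                  # rows of the result are (nearly) orthonormal

rng = np.random.default_rng(0)
Z = rng.standard_normal((64, 256))                # proxy weight matrix, rows <= cols
W = newton_orthogonalize(Z)
print(np.linalg.norm(W @ W.T - np.eye(64)))       # small: W W^T is close to the identity
```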
arXiv Detail & Related papers (2020-04-02T10:14:27Z)