An operator preconditioning perspective on training in physics-informed machine learning
- URL: http://arxiv.org/abs/2310.05801v2
- Date: Fri, 3 May 2024 10:59:26 GMT
- Title: An operator preconditioning perspective on training in physics-informed machine learning
- Authors: Tim De Ryck, Florent Bonnet, Siddhartha Mishra, Emmanuel de Bézenac
- Abstract summary: We investigate the behavior of gradient descent algorithms in physics-informed machine learning methods like PINNs.
Our key result is that the difficulty in training these models is closely related to the conditioning of a specific differential operator.
- Score: 17.919648902857517
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we investigate the behavior of gradient descent algorithms in physics-informed machine learning methods like PINNs, which minimize residuals connected to partial differential equations (PDEs). Our key result is that the difficulty in training these models is closely related to the conditioning of a specific differential operator. This operator, in turn, is associated with the Hermitian square of the differential operator of the underlying PDE. If this operator is ill-conditioned, training is slow or infeasible; preconditioning this operator is therefore crucial. We employ both rigorous mathematical analysis and empirical evaluations to investigate various strategies, explaining how they better condition this critical operator and consequently improve training.
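As a rough, self-contained illustration of the central claim, the NumPy sketch below discretizes the 1D Poisson operator A = -d²/dx² by second differences and compares the conditioning of A, of its Hermitian square AᵀA, and of a preconditioned version. The discretization and the idealized preconditioner are illustrative assumptions, not the paper's construction.

```python
import numpy as np

# Minimal illustration (not the paper's construction): 1D Poisson operator
# A = -d^2/dx^2 with zero boundary conditions, discretized by second differences.
n = 200                       # interior grid points
h = 1.0 / (n + 1)             # mesh width
A = (np.diag(2.0 * np.ones(n))
     - np.diag(np.ones(n - 1), 1)
     - np.diag(np.ones(n - 1), -1)) / h**2

kappa_A = np.linalg.cond(A)           # grows like O(h^-2)
kappa_AtA = np.linalg.cond(A.T @ A)   # grows like O(h^-4): far harder to train against

# Preconditioning with (an idealized approximation of) A^{-1} tames the Hermitian square.
P = np.linalg.inv(A)                  # illustrative exact preconditioner
kappa_pre = np.linalg.cond((P @ A).T @ (P @ A))  # ~1 after preconditioning

print(f"cond(A)        = {kappa_A:.3e}")
print(f"cond(A^T A)    = {kappa_AtA:.3e}  (~ cond(A)^2)")
print(f"preconditioned = {kappa_pre:.3e}")
```

Since gradient-descent convergence rates degrade with the condition number, the roughly O(h⁻⁴) growth of cond(AᵀA) is the mechanism by which unpreconditioned residual minimization stalls on fine discretizations.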
Related papers
- DimOL: Dimensional Awareness as A New 'Dimension' in Operator Learning [63.5925701087252]
We introduce DimOL (Dimension-aware Operator Learning), drawing insights from dimensional analysis.
To implement DimOL, we propose the ProdLayer, which can be seamlessly integrated into FNO-based and Transformer-based PDE solvers.
Empirically, DimOL models achieve up to a 48% performance gain on the PDE datasets tested.
arXiv Detail & Related papers (2024-10-08T10:48:50Z)
- Disentangled Representation Learning for Parametric Partial Differential Equations [31.240283037552427]
We propose a new paradigm for learning disentangled representations from neural operator parameters.
DisentangO is a novel hyper-neural operator architecture designed to unveil and disentangle the latent physical factors of variation embedded within the black-box neural operator parameters.
We show that DisentangO effectively extracts meaningful and interpretable latent features, bridging the divide between predictive performance and physical understanding in neural operator frameworks.
arXiv Detail & Related papers (2024-10-03T01:40:39Z)
- DeltaPhi: Learning Physical Trajectory Residual for PDE Solving [54.13671100638092]
We propose and formulate Physical Trajectory Residual Learning (DeltaPhi).
We learn the surrogate model for the residual operator mapping based on existing neural operator networks.
We conclude that, compared to direct learning, physical residual learning is preferable for PDE solving.
arXiv Detail & Related papers (2024-06-14T07:45:07Z)
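The residual-learning idea behind DeltaPhi can be sketched generically: rather than regressing the solution directly, fit a model to the gap between an existing surrogate's prediction and the target. The toy data, linear models, and names below are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setup (all names and data are illustrative): inputs a, true solutions
# u_true, and predictions from some imperfect existing surrogate.
a = rng.normal(size=(512, 16))
W_true = rng.normal(size=(16, 16))
u_true = a @ W_true
u_baseline = a @ (W_true + 0.3 * rng.normal(size=(16, 16)))  # imperfect surrogate

# Residual learning: fit a (here linear) model to the gap u_true - u_baseline ...
residual = u_true - u_baseline
W_res, *_ = np.linalg.lstsq(a, residual, rcond=None)

# ... and correct the baseline at prediction time.
u_pred = u_baseline + a @ W_res
print("baseline error :", np.linalg.norm(u_true - u_baseline))
print("corrected error:", np.linalg.norm(u_true - u_pred))
```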
- Operator Learning: Algorithms and Analysis [8.305111048568737]
Operator learning refers to the application of ideas from machine learning to approximate operators mapping between Banach spaces of functions.
This review focuses on neural operators, built on the success of deep neural networks in the approximation of functions defined on finite-dimensional Euclidean spaces.
arXiv Detail & Related papers (2024-02-24T04:40:27Z)
- PICL: Physics Informed Contrastive Learning for Partial Differential Equations [7.136205674624813]
We develop a novel contrastive pretraining framework that improves neural operator generalization across multiple governing equations simultaneously.
A combination of physics-informed system evolution and latent-space model output is anchored to the input data and used in our distance function.
We find that physics-informed contrastive pretraining improves accuracy for the Fourier Neural Operator in fixed-future and autoregressive rollout tasks for the 1D and 2D heat, Burgers', and linear advection equations.
arXiv Detail & Related papers (2024-01-29T17:32:22Z)
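The PICL abstract does not spell out its distance function, so the sketch below shows only the generic shape of contrastive pretraining with an InfoNCE-style loss; it is not PICL's actual objective, and the embeddings here are placeholders.

```python
import numpy as np

def info_nce(anchors, positives, temperature=0.1):
    """Generic InfoNCE contrastive loss (not PICL's exact objective).

    anchors, positives: (batch, dim) embeddings; row i of `positives` is the
    positive sample for row i of `anchors`; all other rows act as negatives.
    """
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature               # pairwise cosine similarities
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))          # pull matched pairs together

rng = np.random.default_rng(0)
z = rng.normal(size=(32, 64))                    # placeholder embeddings
print(info_nce(z, z + 0.05 * rng.normal(size=z.shape)))
```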
- Energy-Preserving Reduced Operator Inference for Efficient Design and Control [0.0]
This work presents a physics-preserving reduced-model learning approach that targets partial differential equations with energy-preserving structure.
The resulting method, EP-OpInf, learns efficient and accurate reduced models that retain this energy-preserving structure.
arXiv Detail & Related papers (2024-01-05T16:39:48Z)
- Approximate Bayesian Neural Operators: Uncertainty Quantification for Parametric PDEs [34.179984253109346]
We provide a mathematically detailed Bayesian formulation of the "shallow" (linear) version of neural operators.
We then extend this analytic treatment to general deep neural operators using approximate methods from Bayesian deep learning.
As a result, our approach can identify cases where the neural operator fails to predict well, and provide structured uncertainty estimates for them.
arXiv Detail & Related papers (2022-08-02T16:10:27Z)
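For the shallow (linear) case mentioned above, a Bayesian treatment reduces to Gaussian linear regression with a closed-form posterior; the sketch below uses the standard conjugate formulas as a stand-in, not the paper's full neural-operator construction.

```python
import numpy as np

rng = np.random.default_rng(0)

# Standard conjugate Bayesian linear regression (generic stand-in for the
# "shallow" linear case). Prior w ~ N(0, alpha^-1 I), noise ~ N(0, beta^-1).
alpha, beta = 1.0, 25.0
Phi = rng.normal(size=(100, 10))            # feature matrix
w_true = rng.normal(size=10)
y = Phi @ w_true + rng.normal(scale=beta**-0.5, size=100)

S_inv = alpha * np.eye(10) + beta * Phi.T @ Phi   # posterior precision
S = np.linalg.inv(S_inv)                          # posterior covariance
m = beta * S @ Phi.T @ y                          # posterior mean

# The predictive variance flags inputs where the model is unsure.
phi_new = rng.normal(size=10)
pred_mean = phi_new @ m
pred_var = 1.0 / beta + phi_new @ S @ phi_new
print(pred_mean, pred_var)
```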
- Learning Dynamical Systems via Koopman Operator Regression in Reproducing Kernel Hilbert Spaces [52.35063796758121]
We formalize a framework to learn the Koopman operator from finite data trajectories of the dynamical system.
We link the risk with the estimation of the spectral decomposition of the Koopman operator.
Our results suggest that reduced rank regression (RRR) may be preferable to other widely used estimators.
arXiv Detail & Related papers (2022-05-27T14:57:48Z)
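A common baseline for learning a Koopman operator from trajectory data is extended DMD: lift states through a feature dictionary and solve a least-squares problem for the matrix that advances the features one step. The sketch below implements that generic estimator (not the paper's RRR variant), with an illustrative polynomial dictionary.

```python
import numpy as np

rng = np.random.default_rng(0)

# Trajectory of a simple linear system (stands in for arbitrary dynamics).
A_true = np.array([[0.9, 0.2], [-0.1, 0.95]])
x = np.empty((200, 2))
x[0] = rng.normal(size=2)
for t in range(199):
    x[t + 1] = A_true @ x[t]

def features(x):
    # Simple polynomial dictionary; the feature map is an illustrative choice.
    return np.column_stack([x, x[:, :1] * x[:, 1:], x**2])

X, Y = features(x[:-1]), features(x[1:])     # pairs (phi(x_t), phi(x_{t+1}))
K, *_ = np.linalg.lstsq(X, Y, rcond=None)    # EDMD estimate of the Koopman matrix

# The Koopman eigenvalues approximate the dynamics' spectral content.
print(np.sort(np.abs(np.linalg.eigvals(K.T)))[-2:])
```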
- Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite-dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z)
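The Fourier-flavoured instances of this idea are built around a spectral convolution: transform the input function, reweight a few low Fourier modes with learned multipliers, and transform back. Below is a minimal 1D sketch with random, untrained weights; it shows only the layer's structure, not any particular published implementation.

```python
import numpy as np

def spectral_conv_1d(u, weights):
    """Minimal 1D spectral convolution in the spirit of Fourier neural operators.

    u: (n,) real signal sampled on a uniform grid.
    weights: (k,) complex multipliers for the k lowest Fourier modes.
    """
    u_hat = np.fft.rfft(u)
    out_hat = np.zeros_like(u_hat)
    k = len(weights)
    out_hat[:k] = weights * u_hat[:k]      # act only on the retained low modes
    return np.fft.irfft(out_hat, n=len(u))

rng = np.random.default_rng(0)
u = np.sin(2 * np.pi * np.linspace(0, 1, 128, endpoint=False))
w = rng.normal(size=8) + 1j * rng.normal(size=8)  # untrained, illustrative weights
v = spectral_conv_1d(u, w)
print(v.shape)                                    # same grid, new function values
```

Because the layer acts mode-wise in Fourier space, the same learned weights apply at any grid resolution, which is what lets these models operate between function spaces rather than on fixed-size vectors.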
- Tesseract: Tensorised Actors for Multi-Agent Reinforcement Learning [92.05556163518999]
Multi-agent reinforcement learning (MARL) exacerbates these challenges by imposing various constraints on communication and observability.
For value-based methods, it poses challenges in accurately representing the optimal value function.
For policy gradient methods, it makes training the critic difficult and exacerbates the problem of the lagging critic.
We show that from a learning theory perspective, both problems can be addressed by accurately representing the associated action-value function.
arXiv Detail & Related papers (2021-05-31T23:08:05Z)
- Differentiable Top-k Operator with Optimal Transport [135.36099648554054]
The SOFT top-k operator approximates the output of the top-k operation as the solution of an Entropic Optimal Transport (EOT) problem.
We apply the proposed operator to the k-nearest neighbors and beam search algorithms, and demonstrate improved performance.
arXiv Detail & Related papers (2020-02-16T04:57:52Z)
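The EOT relaxation of top-k can be reproduced in a few lines: treat the n scores as one marginal, two "selected"/"not selected" bins as the other, and run Sinkhorn on a squared-distance cost. The sketch below is a simplification of the SOFT formulation; the choice of bin targets and the scaling are illustrative assumptions.

```python
import numpy as np

def soft_top_k(scores, k, eps=0.1, iters=200):
    """Differentiable top-k via entropic OT, simplified from the SOFT formulation.

    Transports n scores onto two bins (not-selected / selected) with Sinkhorn;
    returns soft selection weights in [0, 1] that sum to k.
    """
    n = len(scores)
    # Cost to the 'selected' bin is low for high scores and vice versa.
    targets = np.array([scores.min(), scores.max()])   # [not-selected, selected]
    C = (scores[:, None] - targets[None, :]) ** 2
    K = np.exp(-C / eps)
    a = np.full(n, 1.0 / n)                            # uniform source marginal
    b = np.array([(n - k) / n, k / n])                 # bin capacities
    u = np.ones(n)
    for _ in range(iters):                             # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]                    # transport plan
    return n * P[:, 1]                                 # soft membership in top-k

s = np.array([0.1, 2.0, -0.5, 1.5, 0.3])
print(soft_top_k(s, k=2).round(3))                     # ~1 for the two largest scores
```

Because every step is smooth in the scores, gradients flow through the selection, which is what enables plugging such an operator into k-nearest-neighbor or beam-search pipelines.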
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.