Improved Operator Learning by Orthogonal Attention
- URL: http://arxiv.org/abs/2310.12487v3
- Date: Thu, 4 Jul 2024 07:20:40 GMT
- Title: Improved Operator Learning by Orthogonal Attention
- Authors: Zipeng Xiao, Zhongkai Hao, Bokai Lin, Zhijie Deng, Hang Su,
- Abstract summary: We develop an attention based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions.
Our method can outperform competing baselines with decent margins.
- Score: 17.394770071994145
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Neural operators, as an efficient surrogate model for learning the solutions of PDEs, have received extensive attention in the field of scientific machine learning. Among them, attention-based neural operators have become one of the mainstreams in related research. However, existing approaches overfit the limited training data due to the considerable number of parameters in the attention mechanism. To address this, we develop an orthogonal attention based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions. The orthogonalization naturally poses a proper regularization effect on the resulting neural operator, which aids in resisting overfitting and boosting generalization. Experiments on six standard neural operator benchmark datasets comprising both regular and irregular geometries show that our method can outperform competing baselines with decent margins.
Related papers
- Towards Gaussian Process for operator learning: an uncertainty aware resolution independent operator learning algorithm for computational mechanics [8.528817025440746]
This paper introduces a novel Gaussian Process (GP) based neural operator for solving parametric differential equations.
We propose a neural operator-embedded kernel'' wherein the GP kernel is formulated in the latent space learned using a neural operator.
Our results highlight the efficacy of this framework in solving complex PDEs while maintaining robustness in uncertainty estimation.
arXiv Detail & Related papers (2024-09-17T08:12:38Z) - Nonlocal Attention Operator: Materializing Hidden Knowledge Towards Interpretable Physics Discovery [25.75410883895742]
We propose a novel neural operator architecture based on the attention mechanism, which we coin Nonlocal Attention Operator (NAO)
NAO can address ill-posedness and rank deficiency in inverse PDE problems by encoding regularization and achieving generalizability.
arXiv Detail & Related papers (2024-08-14T05:57:56Z) - Operator Learning: Algorithms and Analysis [8.305111048568737]
Operator learning refers to the application of ideas from machine learning to approximate operators mapping between Banach spaces of functions.
This review focuses on neural operators, built on the success of deep neural networks in the approximation of functions defined on finite dimensional Euclidean spaces.
arXiv Detail & Related papers (2024-02-24T04:40:27Z) - Function-Space Regularization in Neural Networks: A Probabilistic
Perspective [51.133793272222874]
We show that we can derive a well-motivated regularization technique that allows explicitly encoding information about desired predictive functions into neural network training.
We evaluate the utility of this regularization technique empirically and demonstrate that the proposed method leads to near-perfect semantic shift detection and highly-calibrated predictive uncertainty estimates.
arXiv Detail & Related papers (2023-12-28T17:50:56Z) - Benign Overfitting in Deep Neural Networks under Lazy Training [72.28294823115502]
We show that when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification.
Our results indicate that interpolating with smoother functions leads to better generalization.
arXiv Detail & Related papers (2023-05-30T19:37:44Z) - Learning Neural Eigenfunctions for Unsupervised Semantic Segmentation [12.91586050451152]
Spectral clustering is a theoretically grounded solution to it where the spectral embeddings for pixels are computed to construct distinct clusters.
Current approaches still suffer from inefficiencies in spectral decomposition and inflexibility in applying them to the test data.
This work addresses these issues by casting spectral clustering as a parametric approach that employs neural network-based eigenfunctions to produce spectral embeddings.
In practice, the neural eigenfunctions are lightweight and take the features from pre-trained models as inputs, improving training efficiency and unleashing the potential of pre-trained models for dense prediction.
arXiv Detail & Related papers (2023-04-06T03:14:15Z) - TANGOS: Regularizing Tabular Neural Networks through Gradient
Orthogonalization and Specialization [69.80141512683254]
We introduce Tabular Neural Gradient Orthogonalization and gradient (TANGOS)
TANGOS is a novel framework for regularization in the tabular setting built on latent unit attributions.
We demonstrate that our approach can lead to improved out-of-sample generalization performance, outperforming other popular regularization methods.
arXiv Detail & Related papers (2023-03-09T18:57:13Z) - INO: Invariant Neural Operators for Learning Complex Physical Systems
with Momentum Conservation [8.218875461185016]
We introduce a novel integral neural operator architecture, to learn physical models with fundamental conservation laws automatically guaranteed.
As applications, we demonstrate the expressivity and efficacy of our model in learning complex material behaviors from both synthetic and experimental datasets.
arXiv Detail & Related papers (2022-12-29T16:40:41Z) - Learning Neural Causal Models with Active Interventions [83.44636110899742]
We introduce an active intervention-targeting mechanism which enables a quick identification of the underlying causal structure of the data-generating process.
Our method significantly reduces the required number of interactions compared with random intervention targeting.
We demonstrate superior performance on multiple benchmarks from simulated to real-world data.
arXiv Detail & Related papers (2021-09-06T13:10:37Z) - Neural Operator: Learning Maps Between Function Spaces [75.93843876663128]
We propose a generalization of neural networks to learn operators, termed neural operators, that map between infinite dimensional function spaces.
We prove a universal approximation theorem for our proposed neural operator, showing that it can approximate any given nonlinear continuous operator.
An important application for neural operators is learning surrogate maps for the solution operators of partial differential equations.
arXiv Detail & Related papers (2021-08-19T03:56:49Z) - Provably Efficient Neural Estimation of Structural Equation Model: An
Adversarial Approach [144.21892195917758]
We study estimation in a class of generalized Structural equation models (SEMs)
We formulate the linear operator equation as a min-max game, where both players are parameterized by neural networks (NNs), and learn the parameters of these neural networks using a gradient descent.
For the first time we provide a tractable estimation procedure for SEMs based on NNs with provable convergence and without the need for sample splitting.
arXiv Detail & Related papers (2020-07-02T17:55:47Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.