GNOT: A General Neural Operator Transformer for Operator Learning
- URL: http://arxiv.org/abs/2302.14376v3
- Date: Wed, 14 Jun 2023 12:26:03 GMT
- Title: GNOT: A General Neural Operator Transformer for Operator Learning
- Authors: Zhongkai Hao, Zhengyi Wang, Hang Su, Chengyang Ying, Yinpeng Dong,
Songming Liu, Ze Cheng, Jian Song, Jun Zhu
- Abstract summary: General neural operator transformer (GNOT) is a scalable and effective framework for learning operators.
By designing a novel heterogeneous normalized attention layer, our model can flexibly handle multiple input functions and irregular meshes.
The large model capacity of the transformer architecture allows our model to scale to large datasets and practical problems.
- Score: 34.79481320566005
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Learning partial differential equations' (PDEs) solution operators is an
essential problem in machine learning. However, there are several challenges
for learning operators in practical applications like the irregular mesh,
multiple input functions, and complexity of the PDEs' solution. To address
these challenges, we propose a general neural operator transformer (GNOT), a
scalable and effective transformer-based framework for learning operators. By
designing a novel heterogeneous normalized attention layer, our model can
flexibly handle multiple input functions and irregular meshes. In addition, we
introduce a geometric gating mechanism that can be viewed as a soft domain
decomposition for multi-scale problems. The large model capacity of
the transformer architecture allows our model to scale to large
datasets and practical problems. We conduct extensive experiments on multiple
challenging datasets from different domains and achieve a remarkable
improvement compared with alternative methods. Our code and data are publicly
available at https://github.com/thu-ml/GNOT.
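To make the abstract's two mechanisms concrete, below is a minimal PyTorch sketch written from the abstract alone: a softmax-free ("normalized") cross-attention in which query points attend to tokens sampled from the input functions on an irregular mesh, and a geometric gating layer that mixes expert feed-forward networks based on query coordinates, i.e. a soft domain decomposition. Class names, layer sizes, and the specific channel-wise/token-wise normalization are illustrative assumptions and are not taken from the GNOT codebase.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearCrossAttention(nn.Module):
    """Softmax-free ("normalized") cross-attention: softmax is applied to the
    queries over channels and to the keys over tokens, so the layer costs
    O(N * D^2) instead of O(N^2 * D) and accepts meshes of arbitrary size."""

    def __init__(self, dim: int):
        super().__init__()
        self.q_proj = nn.Linear(dim, dim)
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        self.out = nn.Linear(dim, dim)

    def forward(self, queries: torch.Tensor, tokens: torch.Tensor) -> torch.Tensor:
        # queries: (B, Nq, D) embedded query points
        # tokens:  (B, Nk, D) embedded samples of the input function(s)
        q = F.softmax(self.q_proj(queries), dim=-1)   # normalize over channels
        k = F.softmax(self.k_proj(tokens), dim=1)     # normalize over tokens
        v = self.v_proj(tokens)
        context = torch.einsum("bnd,bne->bde", k, v)  # (B, D, D) input summary
        return self.out(torch.einsum("bnd,bde->bne", q, context))


class GeometricGatingFFN(nn.Module):
    """Soft domain decomposition: a gating MLP over query coordinates produces
    mixture weights for a few expert FFNs, so different spatial regions of the
    domain are handled by different experts."""

    def __init__(self, coord_dim: int, dim: int, n_experts: int = 4):
        super().__init__()
        self.gate = nn.Sequential(nn.Linear(coord_dim, 64), nn.GELU(),
                                  nn.Linear(64, n_experts))
        self.experts = nn.ModuleList([
            nn.Sequential(nn.Linear(dim, 2 * dim), nn.GELU(), nn.Linear(2 * dim, dim))
            for _ in range(n_experts)
        ])

    def forward(self, coords: torch.Tensor, feats: torch.Tensor) -> torch.Tensor:
        # coords: (B, Nq, coord_dim) physical coordinates of the query points
        # feats:  (B, Nq, D) features at those points
        weights = F.softmax(self.gate(coords), dim=-1)                       # (B, Nq, E)
        expert_out = torch.stack([e(feats) for e in self.experts], dim=-1)   # (B, Nq, D, E)
        return torch.einsum("bne,bnde->bnd", weights, expert_out)


if __name__ == "__main__":
    B, Nq, Nk, D = 2, 500, 300, 96
    attn = LinearCrossAttention(D)
    moe = GeometricGatingFFN(coord_dim=2, dim=D)
    queries, tokens = torch.randn(B, Nq, D), torch.randn(B, Nk, D)
    coords = torch.rand(B, Nq, 2)
    out = moe(coords, attn(queries, tokens))
    print(out.shape)  # torch.Size([2, 500, 96])
```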
Related papers
- Learning Linear Attention in Polynomial Time [115.68795790532289]
We provide the first results on learnability of single-layer Transformers with linear attention.
We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS.
We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
arXiv Detail & Related papers (2024-10-14T02:41:01Z) - Transfer Operator Learning with Fusion Frame [0.0]
This work presents a novel framework that enhances the transfer learning capabilities of operator learning models for solving Partial Differential Equations (PDEs).
We introduce an innovative architecture that combines fusion frames with POD-DeepONet, demonstrating superior performance across various PDEs in our experimental analysis.
Our framework addresses the critical challenge of transfer learning in operator learning models, paving the way for adaptable and efficient solutions across a wide range of scientific and engineering applications.
arXiv Detail & Related papers (2024-08-20T00:03:23Z) - Transformers as Neural Operators for Solutions of Differential Equations with Finite Regularity [1.6874375111244329]
We first establish the theoretical groundwork that transformers possess the universal approximation property as operator learning models.
In particular, we consider three examples: the Izhikevich neuron model, the tempered fractional-order Leaky Integrate-and-Fire (LIF) model, and the one-dimensional Euler equation problem.
arXiv Detail & Related papers (2024-05-29T15:10:24Z) - HAMLET: Graph Transformer Neural Operator for Partial Differential Equations [13.970458554623939]
We present a novel graph transformer framework, HAMLET, designed to address the challenges in solving partial differential equations (PDEs) using neural networks.
The framework uses graph transformers with modular input encoders to directly incorporate differential equation information into the solution process.
Notably, HAMLET scales effectively with increasing data complexity and noise, showcasing its robustness.
arXiv Detail & Related papers (2024-02-05T21:55:24Z) - GIT-Net: Generalized Integral Transform for Operator Learning [58.13313857603536]
This article introduces GIT-Net, a deep neural network architecture for approximating Partial Differential Equation (PDE) operators.
GIT-Net harnesses the fact that differential operators commonly used for defining PDEs can often be represented parsimoniously when expressed in specialized functional bases.
Numerical experiments demonstrate that GIT-Net is a competitive neural network operator, exhibiting small test errors and low evaluation costs across a range of PDE problems.
arXiv Detail & Related papers (2023-12-05T03:03:54Z) - Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs [93.82811501035569]
We introduce a new data-efficient and highly parallelizable operator learning approach with reduced memory requirements and better generalization.
MG-TFNO scales to large resolutions by leveraging local and global structures of full-scale, real-world phenomena.
We demonstrate superior performance on the turbulent Navier-Stokes equations where we achieve less than half the error with over 150x compression.
arXiv Detail & Related papers (2023-09-29T20:18:52Z) - FedYolo: Augmenting Federated Learning with Pretrained Transformers [61.56476056444933]
In this work, we investigate pretrained transformers (PTF) to achieve on-device learning goals.
We show that larger scale shrinks the accuracy gaps between alternative approaches and improves robustness.
Finally, it enables clients to solve multiple unrelated tasks simultaneously using a single PTF.
arXiv Detail & Related papers (2023-07-10T21:08:52Z) - On-Device Domain Generalization [93.79736882489982]
Domain generalization is critical to on-device machine learning applications.
We find that knowledge distillation is a strong candidate for solving the problem.
We propose a simple idea called out-of-distribution knowledge distillation (OKD), which aims to teach the student how the teacher handles (synthetic) out-of-distribution data.
arXiv Detail & Related papers (2022-09-15T17:59:31Z) - Learning the Solution Operator of Boundary Value Problems using Graph
Neural Networks [0.0]
We design a general solution operator for two different time-independent PDEs using graph neural networks (GNNs) and spectral graph convolutions.
We train the networks on simulated data from a finite elements solver on a variety of shapes and inhomogeneities.
We find that training on a diverse dataset with lots of variation in the finite element meshes is a key ingredient for achieving good generalization results.
arXiv Detail & Related papers (2022-06-28T15:39:06Z) - Deep invariant networks with differentiable augmentation layers [87.22033101185201]
Methods for learning data augmentation policies require held-out data and are based on bilevel optimization problems.
We show that our approach is easier and faster to train than modern automatic data augmentation techniques.
arXiv Detail & Related papers (2022-02-04T14:12:31Z) - Scalable Gaussian Processes for Data-Driven Design using Big Data with
Categorical Factors [14.337297795182181]
Gaussian processes (GP) have difficulties in accommodating big datasets, categorical inputs, and multiple responses.
We propose a GP model that utilizes latent variables and functions obtained through variational inference to address the aforementioned challenges simultaneously.
Our approach is demonstrated for machine learning of ternary oxide materials and topology optimization of a multiscale compliant mechanism.
arXiv Detail & Related papers (2021-06-26T02:17:23Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.