Transformer for Partial Differential Equations' Operator Learning
- URL: http://arxiv.org/abs/2205.13671v3
- Date: Thu, 27 Apr 2023 21:01:23 GMT
- Title: Transformer for Partial Differential Equations' Operator Learning
- Authors: Zijie Li, Kazem Meidani, Amir Barati Farimani
- Abstract summary: We present an attention-based framework for data-driven operator learning, which we term Operator Transformer (OFormer).
Our framework is built upon self-attention, cross-attention, and a set of point-wise multilayer perceptrons (MLPs).
- Score: 0.0
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Data-driven learning of partial differential equations' solution operators
has recently emerged as a promising paradigm for approximating the underlying
solutions. The solution operators are usually parameterized by deep learning
models that are built upon problem-specific inductive biases. An example is a
convolutional or a graph neural network that exploits the local grid structure
where functions' values are sampled. The attention mechanism, on the other
hand, provides a flexible way to implicitly exploit the patterns within inputs
and, furthermore, the relationship between arbitrary query locations and inputs. In
this work, we present an attention-based framework for data-driven operator
learning, which we term Operator Transformer (OFormer). Our framework is built
upon self-attention, cross-attention, and a set of point-wise multilayer
perceptrons (MLPs), and thus it makes few assumptions on the sampling pattern
of the input function or query locations. We show that the proposed framework
is competitive on standard benchmark problems and can flexibly be adapted to
randomly sampled input.
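
To make the architecture described above more concrete, here is a minimal PyTorch sketch of an attention-based operator learner in the spirit of the abstract (this is not the released OFormer code; the module names, layer sizes, and the specific attention layout are illustrative assumptions): input samples (x_i, u(x_i)) are encoded with self-attention and point-wise MLPs, and arbitrary query coordinates attend to the encoded tokens via cross-attention before a point-wise decoder produces the output values.

```python
import torch
import torch.nn as nn

class AttentionOperator(nn.Module):
    """Illustrative sketch: self-attention encoder + cross-attention query decoder."""
    def __init__(self, coord_dim=1, value_dim=1, width=64, heads=4, depth=2):
        super().__init__()
        self.embed_in = nn.Linear(coord_dim + value_dim, width)   # token for (x_i, u(x_i))
        self.embed_query = nn.Linear(coord_dim, width)            # token for a query location
        self.self_attn = nn.ModuleList(
            [nn.MultiheadAttention(width, heads, batch_first=True) for _ in range(depth)])
        self.ffn = nn.ModuleList(
            [nn.Sequential(nn.Linear(width, width), nn.GELU(), nn.Linear(width, width))
             for _ in range(depth)])
        self.cross_attn = nn.MultiheadAttention(width, heads, batch_first=True)
        self.decode = nn.Sequential(nn.Linear(width, width), nn.GELU(),
                                    nn.Linear(width, value_dim))

    def forward(self, in_coords, in_values, query_coords):
        # in_coords: (B, N, d), in_values: (B, N, c), query_coords: (B, M, d)
        z = self.embed_in(torch.cat([in_coords, in_values], dim=-1))
        for attn, ffn in zip(self.self_attn, self.ffn):
            z = z + attn(z, z, z, need_weights=False)[0]   # self-attention over input points
            z = z + ffn(z)                                  # point-wise MLP, applied per token
        q = self.embed_query(query_coords)
        q = q + self.cross_attn(q, z, z, need_weights=False)[0]  # queries attend to inputs
        return self.decode(q)                               # predicted values at the queries

# Input and query points can be irregularly and independently sampled.
model = AttentionOperator()
u_pred = model(torch.rand(8, 100, 1), torch.randn(8, 100, 1), torch.rand(8, 50, 1))
print(u_pred.shape)  # torch.Size([8, 50, 1])
```

Because nothing in this layout depends on a fixed grid, the input points and query points can differ in number and location from sample to sample, which is the flexibility the abstract refers to.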
Related papers
- Unsupervised Representation Learning from Sparse Transformation Analysis [79.94858534887801]
We propose to learn representations from sequence data by factorizing the transformations of the latent variables into sparse components.
Input data are first encoded as distributions of latent activations and subsequently transformed using a probability flow model.
arXiv Detail & Related papers (2024-10-07T23:53:25Z) - PROSE: Predicting Operators and Symbolic Expressions using Multimodal Transformers [5.263113622394007]
We develop a new neural network framework for predicting differential equations.
By using a transformer structure and a feature fusion approach, our network can simultaneously embed sets of solution operators for various parametric differential equations.
The network is shown to be able to handle noise in the data and errors in the symbolic representation, including noisy numerical values, model misspecification, and erroneous addition or deletion of terms.
arXiv Detail & Related papers (2023-09-28T19:46:07Z) - Self-Supervised Learning for Group Equivariant Neural Networks [75.62232699377877]
Group equivariant neural networks are models whose structure is constrained to commute with transformations of the input.
We propose two concepts for self-supervised tasks: equivariant pretext labels and invariant contrastive loss.
Experiments on standard image recognition benchmarks demonstrate that the equivariant neural networks exploit the proposed self-supervised tasks.
arXiv Detail & Related papers (2023-03-08T08:11:26Z) - Variational Autoencoding Neural Operators [17.812064311297117]
Unsupervised learning with functional data is an emerging paradigm of machine learning research with applications to computer vision, climate modeling and physical systems.
We present Variational Autoencoding Neural Operators (VANO), a general strategy for making a large class of operator learning architectures act as variational autoencoders.
arXiv Detail & Related papers (2023-02-20T22:34:43Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined heuristics.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Amortized Inference for Causal Structure Learning [72.84105256353801]
Learning causal structure poses a search problem that typically involves evaluating structures using a score or independence test.
We train a variational inference model to predict the causal structure from observational/interventional data.
Our models exhibit robust generalization capabilities under substantial distribution shift.
arXiv Detail & Related papers (2022-05-25T17:37:08Z) - Learning Operators with Coupled Attention [9.715465024071333]
We propose a novel operator learning method, LOCA, motivated from the recent success of the attention mechanism.
In our architecture, the input functions are mapped to a finite set of features which are then averaged with attention weights that depend on the output query locations.
By coupling these attention weights together with an integral transform, LOCA is able to explicitly learn correlations in the target output functions (a schematic sketch of this attention-weighted averaging appears after this list).
arXiv Detail & Related papers (2022-01-04T08:22:03Z) - A research framework for writing differentiable PDE discretizations in JAX [3.4389358108344257]
Differentiable simulators are an emerging concept with applications in several fields, from reinforcement learning to optimal control.
We propose a library of differentiable operators and discretizations, by representing operators as mappings between families of continuous functions, parametrized by finite vectors.
We demonstrate the approach on an acoustic optimization problem, where the Helmholtz equation is discretized using Fourier spectral methods, and differentiability is demonstrated using gradient descent to optimize the speed of sound of an acoustic lens.
arXiv Detail & Related papers (2021-11-09T15:58:44Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of a VAE failing to consistently encode samples generated from its own decoder, and the consequences of fixing this behaviour by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Joint learning of variational representations and solvers for inverse problems with partially-observed data [13.984814587222811]
In this paper, we design an end-to-end framework that allows learning variational formulations for inverse problems in a supervised setting.
The variational cost and the gradient-based solver are both stated as neural networks using automatic differentiation for the latter.
This leads to a data-driven discovery of variational models.
arXiv Detail & Related papers (2020-06-05T19:53:34Z) - Learning What Makes a Difference from Counterfactual Examples and Gradient Supervision [57.14468881854616]
We propose an auxiliary training objective that improves the generalization capabilities of neural networks.
We use pairs of minimally-different examples with different labels, a.k.a. counterfactual or contrasting examples, which provide a signal indicative of the underlying causal structure of the task.
Models trained with this technique demonstrate improved performance on out-of-distribution test sets.
arXiv Detail & Related papers (2020-04-20T02:47:49Z)
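
To make the Learning Operators with Coupled Attention (LOCA) entry above a little more concrete, here is a minimal, illustrative PyTorch sketch of the idea it describes (this is not the LOCA reference implementation; the network sizes, the Monte Carlo averaging used as a stand-in for the integral transform, and all names are assumptions): input function samples are mapped to a finite set of features, and each output query location produces softmax attention weights used to average those features.

```python
import torch
import torch.nn as nn

class CoupledAttentionOperator(nn.Module):
    """Illustrative sketch: query-dependent attention weights average a finite feature set."""
    def __init__(self, coord_dim=1, value_dim=1, n_features=32, width=64):
        super().__init__()
        self.n_features, self.value_dim = n_features, value_dim
        # Maps each sampled pair (x_i, u(x_i)) to a set of candidate features.
        self.feature_net = nn.Sequential(
            nn.Linear(coord_dim + value_dim, width), nn.GELU(),
            nn.Linear(width, n_features * value_dim))
        # Maps a query location y to attention logits over those features.
        self.score_net = nn.Sequential(
            nn.Linear(coord_dim, width), nn.GELU(), nn.Linear(width, n_features))

    def forward(self, in_coords, in_values, query_coords):
        # in_coords: (B, N, d), in_values: (B, N, c), query_coords: (B, M, d)
        feats = self.feature_net(torch.cat([in_coords, in_values], dim=-1))  # (B, N, K*c)
        # Crude stand-in for the integral transform: a Monte Carlo average of
        # the per-point features over the sampled input locations.
        feats = feats.mean(dim=1).view(-1, self.n_features, self.value_dim)  # (B, K, c)
        weights = torch.softmax(self.score_net(query_coords), dim=-1)        # (B, M, K)
        # Each query's output is an attention-weighted average of the features.
        return torch.einsum('bmk,bkc->bmc', weights, feats)                  # (B, M, c)

model = CoupledAttentionOperator()
out = model(torch.rand(4, 128, 1), torch.randn(4, 128, 1), torch.rand(4, 20, 1))
print(out.shape)  # torch.Size([4, 20, 1])
```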