Related papers: SUPRA: Subspace Parameterized Attention for Neural Operator on General Domains

URL: http://arxiv.org/abs/2504.15897v1
Date: Tue, 22 Apr 2025 13:40:04 GMT
Title: SUPRA: Subspace Parameterized Attention for Neural Operator on General Domains
Authors: Zherui Yang, Zhengyang Xue, Ligang Liu
Abstract summary: The Subspace Parameterized Attention (SUPRA) neural operator approximates the attention mechanism within a finite-dimensional subspace. We show that SUPRA reduces error rates by up to 33% on various PDE datasets while maintaining state-of-the-art computational efficiency.
Score: 19.70999041826902
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Neural operators are efficient surrogate models for solving partial differential equations (PDEs), but their key components face challenges: (1) in order to improve accuracy, attention mechanisms suffer from computational inefficiency on large-scale meshes, and (2) spectral convolutions rely on the Fast Fourier Transform (FFT) on regular grids and assume a flat geometry, which causes accuracy degradation on irregular domains. To tackle these problems, we regard the matrix-vector operations in the standard attention mechanism on vectors in Euclidean space as bilinear forms and linear operators in vector spaces and generalize the attention mechanism to function spaces. This new attention mechanism is fully equivalent to the standard attention but impossible to compute due to the infinite dimensionality of function spaces. To address this, inspired by model reduction techniques, we propose a Subspace Parameterized Attention (SUPRA) neural operator, which approximates the attention mechanism within a finite-dimensional subspace. To construct a subspace on irregular domains for SUPRA, we propose using the Laplacian eigenfunctions, which naturally adapt to domains' geometry and guarantee the optimal approximation for smooth functions. Experiments show that the SUPRA neural operator reduces error rates by up to 33% on various PDE datasets while maintaining state-of-the-art computational efficiency.
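
Illustration: the abstract describes approximating attention inside a finite-dimensional subspace spanned by Laplacian eigenfunctions. The NumPy sketch below is one simple reading of that idea, not the authors' implementation. It assumes a 1D path-graph Laplacian as a stand-in for a mesh Laplacian, treats the retained modes as attention tokens, and uses random projection matrices; the names laplacian_eigenbasis, subspace_attention, Wq, Wk, and Wv are hypothetical.

```python
import numpy as np

def laplacian_eigenbasis(n_points, n_modes):
    """Return the n_modes lowest-frequency eigenvectors of a path-graph Laplacian.

    Assumption: a 1D path graph stands in for the Laplacian of a real mesh.
    """
    lap = 2.0 * np.eye(n_points) - np.eye(n_points, k=1) - np.eye(n_points, k=-1)
    lap[0, 0] = lap[-1, -1] = 1.0  # free (Neumann-like) ends
    _, evecs = np.linalg.eigh(lap)  # eigenvalues ascending, columns orthonormal
    return evecs[:, :n_modes]

def subspace_attention(u, basis, Wq, Wk, Wv):
    """Softmax attention computed on subspace coefficients instead of mesh points.

    u:     (n_points, channels) sampled input functions
    basis: (n_points, n_modes) orthonormal basis of the subspace
    """
    coeffs = basis.T @ u  # project onto the subspace: (n_modes, channels)
    q, k, v = coeffs @ Wq, coeffs @ Wk, coeffs @ Wv
    scores = q @ k.T / np.sqrt(q.shape[1])  # (n_modes, n_modes)
    scores -= scores.max(axis=-1, keepdims=True)  # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return basis @ (weights @ v)  # lift back to the mesh: (n_points, channels)

rng = np.random.default_rng(0)
n_points, n_modes, channels = 64, 8, 4
basis = laplacian_eigenbasis(n_points, n_modes)
u = rng.standard_normal((n_points, channels))
Wq, Wk, Wv = (rng.standard_normal((channels, channels)) for _ in range(3))
out = subspace_attention(u, basis, Wq, Wk, Wv)
print(out.shape)
```

In this sketch the attention matrix has size n_modes by n_modes, so its cost does not grow quadratically with the number of mesh points; only the projection and lifting steps touch the full mesh.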

Related papers

Integrating Locality-Aware Attention with Transformers for General Geometry PDEs [24.336598771550157]
We propose the Locality-Aware Attention Transformer (LA2Former) for learning mappings governed by partial differential equations (PDEs). By combining linear attention for efficient global context encoding with pairwise attention for capturing intricate local interactions, LA2Former achieves an optimal balance between computational efficiency and predictive accuracy. This work underscores the critical importance of localized feature learning in advancing Transformer-based neural operators for solving PDEs on complex and irregular domains.
arXiv Detail & Related papers (2025-04-18T05:43:49Z)
Partial Transportability for Domain Generalization [56.37032680901525]
Building on the theory of partial identification and transportability, this paper introduces new results for bounding the value of a functional of the target distribution. Our contribution is to provide the first general estimation technique for transportability problems. We propose a gradient-based optimization scheme for making scalable inferences in practice.
arXiv Detail & Related papers (2025-03-30T22:06:37Z)
3D Equivariant Pose Regression via Direct Wigner-D Harmonics Prediction [50.07071392673984]
Existing methods learn 3D rotations parametrized in the spatial domain using angles or quaternions. We propose a frequency-domain approach that directly predicts Wigner-D coefficients for 3D rotation regression. Our method achieves state-of-the-art results on benchmarks such as ModelNet10-SO(3) and PASCAL3D+.
arXiv Detail & Related papers (2024-11-01T12:50:38Z)
Shape-informed surrogate models based on signed distance function domain encoding [8.052704959617207]
We propose a non-intrusive method to build surrogate models that approximate the solution of parameterized partial differential equations (PDEs). Our approach is based on the combination of two neural networks (NNs).
arXiv Detail & Related papers (2024-09-19T01:47:04Z)
Spectral Algorithms on Manifolds through Diffusion [1.7227952883644062]
We study the convergence performance of spectral algorithms in the Reproducing Kernel Space. We employ integral operator techniques to derive tight convergence upper bounds concerning generalized norms. Our study confirms that the spectral algorithms are practically significant in the broader context of high-dimensional approximation.
arXiv Detail & Related papers (2024-03-06T12:43:53Z)
Dynamic Gaussian Graph Operator: Learning parametric partial differential equations in arbitrary discrete mechanics problems [33.32926047057572]
We propose a novel operator learning algorithm that extends neural operators to learning parametric PDEs in arbitrary discrete mechanics problems. The efficiency and robustness of DGGO are validated by applying it to numerical problems in arbitrary discrete mechanics. As an engineering application, the proposed method is used to forecast the stress field of a hyper-elastic material with a geometrically variable void.
arXiv Detail & Related papers (2024-03-05T09:25:31Z)
Improved Operator Learning by Orthogonal Attention [17.394770071994145]
We develop an attention based on the eigendecomposition of the kernel integral operator and the neural approximation of eigenfunctions. Our method can outperform competing baselines with decent margins.
arXiv Detail & Related papers (2023-10-19T05:47:28Z)
Multi-Grid Tensorized Fourier Neural Operator for High-Resolution PDEs [93.82811501035569]
We introduce a new data efficient and highly parallelizable operator learning approach with reduced memory requirement and better generalization. MG-TFNO scales to large resolutions by leveraging local and global structures of full-scale, real-world phenomena. We demonstrate superior performance on the turbulent Navier-Stokes equations where we achieve less than half the error with over 150x compression.
arXiv Detail & Related papers (2023-09-29T20:18:52Z)
Beyond Regular Grids: Fourier-Based Neural Operators on Arbitrary Domains [13.56018270837999]
We propose a simple method to extend neural operators to arbitrary domains. An efficient implementation of such direct spectral evaluations is coupled with existing neural operator models. We demonstrate that the proposed method allows us to extend neural operators to arbitrary point distributions with significant gains in training speed over baselines.
arXiv Detail & Related papers (2023-05-31T09:01:20Z)
Solving High-Dimensional PDEs with Latent Spectral Models [74.1011309005488]
We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs. Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space. LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks.
arXiv Detail & Related papers (2023-01-30T04:58:40Z)
Detecting Rotated Objects as Gaussian Distributions and Its 3-D Generalization [81.29406957201458]
Existing detection methods commonly use a parameterized bounding box (BBox) to model and detect (horizontal) objects. We argue that such a mechanism has fundamental limitations in building an effective regression loss for rotation detection. We propose to model the rotated objects as Gaussian distributions. We extend our approach from 2-D to 3-D with a tailored algorithm design to handle the heading estimation.
arXiv Detail & Related papers (2022-09-22T07:50:48Z)
Minimax Estimation of Linear Functions of Eigenvectors in the Face of Small Eigen-Gaps [95.62172085878132]
Eigenvector perturbation analysis plays a vital role in various statistical data science applications. We develop a suite of statistical theory that characterizes the perturbation of arbitrary linear functions of an unknown eigenvector. In order to mitigate a non-negligible bias issue inherent to the natural "plug-in" estimator, we develop de-biased estimators.
arXiv Detail & Related papers (2021-04-07T17:55:10Z)
Understanding Implicit Regularization in Over-Parameterized Single Index Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model. We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z)

This list is automatically generated from the titles and abstracts of the papers on this site.