Related papers: Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs

URL: http://arxiv.org/abs/2410.02113v2
Date: Wed, 09 Apr 2025 21:36:19 GMT
Title: Mamba Neural Operator: Who Wins? Transformers vs. State-Space Models for PDEs
Authors: Chun-Wun Cheng, Jiahao Huang, Yi Zhang, Guang Yang, Carola-Bibiane Schönlieb, Angelica I Aviles-Rivero
Abstract summary: Partial differential equations (PDEs) are widely used to model complex physical systems. Transformers have emerged as the preferred architecture for PDEs due to their ability to capture intricate dependencies. We introduce the Mamba Neural Operator (MNO), a novel framework that enhances neural operator-based techniques for solving PDEs.
Score: 14.14673083512826
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Partial differential equations (PDEs) are widely used to model complex physical systems, but solving them efficiently remains a significant challenge. Recently, Transformers have emerged as the preferred architecture for PDEs due to their ability to capture intricate dependencies. However, they struggle with representing continuous dynamics and long-range interactions. To overcome these limitations, we introduce the Mamba Neural Operator (MNO), a novel framework that enhances neural operator-based techniques for solving PDEs. MNO establishes a formal theoretical connection between structured state-space models (SSMs) and neural operators, offering a unified structure that can adapt to diverse architectures, including Transformer-based models. By leveraging the structured design of SSMs, MNO captures long-range dependencies and continuous dynamics more effectively than traditional Transformers. Through extensive analysis, we show that MNO significantly boosts the expressive power and accuracy of neural operators, making it not just a complement but a superior framework for PDE-related tasks, bridging the gap between efficient representation and accurate solution approximation.

Related papers

PMNO: A novel physics guided multi-step neural operator predictor for partial differential equations [23.04840527974364]
We propose a novel physics guided multi-step neural operator (PMNO) architecture to address challenges in long-horizon prediction of complex physical systems. The PMNO framework replaces the single-step input with multi-step historical data in the forward pass and introduces an implicit time-stepping scheme during backpropagation. We demonstrate the superior predictive performance of PMNO predictor across a diverse range of physical systems.
arXiv Detail & Related papers (2025-06-02T12:33:50Z)
Latent Mamba Operator for Partial Differential Equations [8.410938527671341]
We introduce the Latent Mamba Operator (LaMO), which integrates the efficiency of state-space models (SSMs) in latent space with the expressive power of kernel integral formulations in neural operators. LaMOs achieve consistent state-of-the-art (SOTA) performance, with a 32.3% improvement over existing baselines in solution operator approximation.
arXiv Detail & Related papers (2025-05-25T11:51:31Z)
Instruction-Guided Autoregressive Neural Network Parameter Generation [49.800239140036496]
We propose IGPG, an autoregressive framework that unifies parameter synthesis across diverse tasks and architectures. By autoregressively generating neural network weights' tokens, IGPG ensures inter-layer coherence and enables efficient adaptation across models and datasets. Experiments on multiple datasets demonstrate that IGPG consolidates diverse pretrained models into a single, flexible generative framework.
arXiv Detail & Related papers (2025-04-02T05:50:19Z)
Efficient Transformed Gaussian Process State-Space Models for Non-Stationary High-Dimensional Dynamical Systems [49.819436680336786]
We propose an efficient transformed Gaussian process state-space model (ETGPSSM) for scalable and flexible modeling of high-dimensional, non-stationary dynamical systems. Specifically, our ETGPSSM integrates a single shared GP with input-dependent normalizing flows, yielding an expressive implicit process prior that captures complex, non-stationary transition dynamics. Our ETGPSSM outperforms existing GPSSMs and neural network-based SSMs in terms of computational efficiency and accuracy.
arXiv Detail & Related papers (2025-03-24T03:19:45Z)
Neural ODE Transformers: Analyzing Internal Dynamics and Adaptive Fine-tuning [30.781578037476347]
We introduce a novel approach to modeling transformer architectures using highly flexible non-autonomous neural ordinary differential equations (ODEs). Our proposed model parameterizes all weights of attention and feed-forward blocks through neural networks, expressing these weights as functions of a continuous layer index. Our neural ODE transformer demonstrates performance comparable to or better than vanilla transformers across various configurations and datasets.
arXiv Detail & Related papers (2025-03-03T09:12:14Z)
High-fidelity Multiphysics Modelling for Rapid Predictions Using Physics-informed Parallel Neural Operator [17.85837423448985]
Modelling complex multiphysics systems governed by nonlinear and strongly coupled partial differential equations (PDEs) is a cornerstone in computational science and engineering. We propose a novel paradigm, physics-informed parallel neural operator (PIPNO), a scalable and unsupervised learning framework. PIPNO efficiently captures nonlinear operator mappings across diverse physics, including geotechnical engineering, material science, electromagnetism, quantum mechanics, and fluid dynamics.
arXiv Detail & Related papers (2025-02-26T20:29:41Z)
Advancing Generalization in PINNs through Latent-Space Representations [71.86401914779019]
Physics-informed neural networks (PINNs) have made significant strides in modeling dynamical systems governed by partial differential equations (PDEs). We propose PIDO, a novel physics-informed neural PDE solver designed to generalize effectively across diverse PDE configurations. We validate PIDO on a range of benchmarks, including 1D combined equations and 2D Navier-Stokes equations.
arXiv Detail & Related papers (2024-11-28T13:16:20Z)
Automatically Learning Hybrid Digital Twins of Dynamical Systems [56.69628749813084]
Digital Twins (DTs) simulate the states and temporal dynamics of real-world systems. DTs often struggle to generalize to unseen conditions in data-scarce settings. In this paper, we propose an evolutionary algorithm (HDTwinGen) to autonomously propose, evaluate, and optimize HDTwins.
arXiv Detail & Related papers (2024-10-31T07:28:22Z)
AROMA: Preserving Spatial Structure for Latent PDE Modeling with Local Neural Fields [14.219495227765671]
We present AROMA, a framework designed to enhance the modeling of partial differential equations (PDEs) using local neural fields. Our flexible encoder-decoder architecture can obtain smooth latent representations of spatial physical fields from a variety of data types. By employing a diffusion-based formulation, we achieve greater stability and enable longer rollouts compared to conventional MSE training.
arXiv Detail & Related papers (2024-06-04T10:12:09Z)
A foundational neural operator that continuously learns without forgetting [1.0878040851638]
We introduce the concept of the Neural Combinatorial Wavelet Neural Operator (NCWNO) as a foundational model for scientific computing. The NCWNO is specifically designed to excel in learning from a diverse spectrum of physics and continuously adapt to the solution operators associated with parametric partial differential equations (PDEs). The proposed foundational model offers two key advantages: (i) it can simultaneously learn solution operators for multiple parametric PDEs, and (ii) it can swiftly generalize to new parametric PDEs with minimal fine-tuning.
arXiv Detail & Related papers (2023-10-29T03:20:10Z)
Neural Operators for Accelerating Scientific Simulations and Design [85.89660065887956]
An AI framework, known as Neural Operators, presents a principled framework for learning mappings between functions defined on continuous domains. Neural Operators can augment or even replace existing simulators in many applications, such as computational fluid dynamics, weather forecasting, and material modeling.
arXiv Detail & Related papers (2023-09-27T00:12:07Z)
Deep Stochastic Processes via Functional Markov Transition Operators [59.55961312230447]
We introduce a new class of stochastic processes (SPs) constructed by stacking sequences of neural parameterised Markov transition operators in function space. We prove that these Markov transition operators can preserve the exchangeability and consistency of SPs.
arXiv Detail & Related papers (2023-05-24T21:15:23Z)
Learning PDE Solution Operator for Continuous Modeling of Time-Series [1.39661494747879]
This work presents a partial differential equation (PDE) based framework which improves the dynamics modeling capability. We propose a neural operator that can handle time continuously without requiring iterative operations or specific grids of temporal discretization. Our framework opens up a new way for a continuous representation of neural networks that can be readily adopted for real-world applications.
arXiv Detail & Related papers (2023-02-02T03:47:52Z)
Solving High-Dimensional PDEs with Latent Spectral Models [74.1011309005488]
We present Latent Spectral Models (LSM) toward an efficient and precise solver for high-dimensional PDEs. Inspired by classical spectral methods in numerical analysis, we design a neural spectral block to solve PDEs in the latent space. LSM achieves consistent state-of-the-art and yields a relative gain of 11.5% averaged on seven benchmarks.
arXiv Detail & Related papers (2023-01-30T04:58:40Z)
LordNet: An Efficient Neural Network for Learning to Solve Parametric Partial Differential Equations without Simulated Data [47.49194807524502]
We propose LordNet, a tunable and efficient neural network for modeling entanglements. The experiments on solving Poisson's equation and (2D and 3D) Navier-Stokes equation demonstrate that the long-range entanglements can be well modeled by the LordNet.
arXiv Detail & Related papers (2022-06-19T14:41:08Z)
Neural Operator with Regularity Structure for Modeling Dynamics Driven by SPDEs [70.51212431290611]
Stochastic partial differential equations (SPDEs) are significant tools for modeling dynamics in many areas including atmospheric sciences and physics. We propose the Neural Operator with Regularity Structure (NORS) which incorporates the feature vectors for modeling dynamics driven by SPDEs. We conduct experiments on a variety of SPDEs including the dynamic $\Phi^4_1$ model and the 2D Navier-Stokes equation.
arXiv Detail & Related papers (2022-04-13T08:53:41Z)
Frame invariance and scalability of neural operators for partial differential equations [5.872676314924041]
Partial differential equations (PDEs) play a dominant role in the mathematical modeling of many complex dynamical processes. After training, neural operators can provide PDE solutions significantly faster than traditional PDE solvers.
arXiv Detail & Related papers (2021-12-28T02:36:19Z)

This list is automatically generated from the titles and abstracts of the papers on this site.