Efficient Transformer-Inspired Variants of Physics-Informed Deep Operator Networks
- URL: http://arxiv.org/abs/2509.01679v1
- Date: Mon, 01 Sep 2025 18:01:23 GMT
- Title: Efficient Transformer-Inspired Variants of Physics-Informed Deep Operator Networks
- Authors: Zhi-Feng Wei, Wenqian Chen, Panos Stinis
- Abstract summary: Transformer-inspired DeepONet variants introduce bidirectional cross-conditioning between the branch and trunk networks in DeepONet. Experiments on four PDE benchmarks show that for each case, there exists a variant that matches or surpasses the accuracy of the modified DeepONet.
- Score: 0.509780930114934
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Operator learning has emerged as a promising tool for accelerating the solution of partial differential equations (PDEs). The Deep Operator Networks (DeepONets) represent a pioneering framework in this area: the "vanilla" DeepONet is valued for its simplicity and efficiency, while the modified DeepONet achieves higher accuracy at the cost of increased training time. In this work, we propose a series of Transformer-inspired DeepONet variants that introduce bidirectional cross-conditioning between the branch and trunk networks in DeepONet. Query-point information is injected into the branch network and input-function information into the trunk network, enabling dynamic dependencies while preserving the simplicity and efficiency of the "vanilla" DeepONet in a non-intrusive manner. Experiments on four PDE benchmarks -- advection, diffusion-reaction, Burgers', and Korteweg-de Vries equations -- show that for each case, there exists a variant that matches or surpasses the accuracy of the modified DeepONet while offering improved training efficiency. Moreover, the best-performing variant for each equation aligns naturally with the equation's underlying characteristics, suggesting that the effectiveness of cross-conditioning depends on the characteristics of the equation and its underlying physics. To ensure robustness, we validate the effectiveness of our variants through a range of rigorous statistical analyses, among them the Wilcoxon Two One-Sided Test, Glass's Delta, and Spearman's rank correlation.
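The abstract describes injecting query-point information into the branch network and input-function information into the trunk network. As a minimal illustrative sketch (not the authors' implementation), the toy NumPy model below conditions each network on a simple mean summary of the other network's input; the summary choice, layer sizes, and random untrained weights are all assumptions made here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def mlp(sizes):
    # Random-weight MLP for illustration only (no training loop here).
    params = [(rng.standard_normal((a, b)) / np.sqrt(a), np.zeros(b))
              for a, b in zip(sizes[:-1], sizes[1:])]
    def forward(x):
        for i, (W, b) in enumerate(params):
            x = x @ W + b
            if i < len(params) - 1:
                x = np.tanh(x)
        return x
    return forward

m, p = 16, 32  # number of input-function sensors, latent feature width

# Vanilla DeepONet: branch sees only u, trunk sees only the query point y.
# Cross-conditioned variant (sketched): each network also receives a
# low-dimensional summary of the other network's input.
branch = mlp([m + 1, 64, p])  # +1: summary of query points
trunk = mlp([1 + 1, 64, p])   # +1: summary of input function

def deeponet_cross(u, ys):
    """u: (m,) sensor values; ys: (k, 1) query points -> (k,) predictions."""
    y_summary = ys.mean(axis=0)                         # (1,) summary of queries
    u_summary = np.full((len(ys), 1), u.mean())         # (k, 1) summary of u
    b = branch(np.concatenate([u, y_summary]))          # (p,) branch features
    t = trunk(np.concatenate([ys, u_summary], axis=1))  # (k, p) trunk features
    return t @ b                                        # inner product per query

u = np.sin(np.linspace(0.0, np.pi, m))
ys = np.linspace(0.0, 1.0, 8)[:, None]
print(deeponet_cross(u, ys).shape)  # (8,)
```

Dropping the two `summary` inputs (and the corresponding `+1` input widths) recovers the "vanilla" DeepONet, which is why the paper can describe the cross-conditioning as non-intrusive.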
Related papers
- TopoCurate:Modeling Interaction Topology for Tool-Use Agent Training [53.93696896939915]
Training tool-use agents typically relies on Supervised Fine-Tuning (SFT) on successful trajectories and Reinforcement Learning (RL) on pass-rate-selected tasks. We propose TopoCurate, an interaction-aware framework that projects multi-trial rollouts from the same task into a unified semantic quotient topology. TopoCurate achieves consistent gains of 4.2% (SFT) and 6.9% (RL) over state-of-the-art baselines.
arXiv Detail & Related papers (2026-03-02T10:38:54Z) - AMORE: Adaptive Multi-Output Operator Network for Stiff Chemical Kinetics [4.621457883636921]
Time integration of stiff systems is a primary source of computational cost in combustion, hypersonics, and other reactive transport systems. We develop AMORE, a framework comprising an operator capable of predicting multiple outputs and adaptive loss functions.
arXiv Detail & Related papers (2025-10-15T00:43:30Z) - Fractional Spike Differential Equations Neural Network with Efficient Adjoint Parameters Training [63.3991315762955]
Spiking Neural Networks (SNNs) draw inspiration from biological neurons to create realistic models for brain-like computation. Most existing SNNs assume a single time constant for neuronal membrane voltage dynamics, modeled by first-order ordinary differential equations (ODEs) with Markovian characteristics. We propose the Fractional SPIKE Differential Equation neural network (fspikeDE), which captures long-term dependencies in membrane voltage and spike trains through fractional-order dynamics.
arXiv Detail & Related papers (2025-07-22T18:20:56Z) - SetONet: A Deep Set-based Operator Network for Solving PDEs with permutation invariant variable input sampling [2.95983424663256]
We introduce the Set Operator Network (SetONet), a novel architecture that integrates Deep Sets principles into the DeepONet framework. The core innovation lies in the SetONet branch network, which processes the input function as an unordered set of location-value pairs. We demonstrate SetONet's effectiveness on several benchmark problems, including derivative/anti-derivative operators, 1D Darcy flow, and 2D elasticity.
arXiv Detail & Related papers (2025-05-07T18:50:05Z) - BO-SA-PINNs: Self-adaptive physics-informed neural networks based on Bayesian optimization for automatically designing PDE solvers [13.048817629665649]
Physics-informed neural networks (PINNs) are a popular alternative method for solving partial differential equations (PDEs). PINNs require dedicated manual modifications to the hyperparameters of the network, the sampling methods, and loss function weights for different PDEs, which reduces the efficiency of the solvers. We propose a general multi-stage framework, i.e. BO-SA-PINNs, to alleviate this issue.
arXiv Detail & Related papers (2025-04-14T02:07:45Z) - On the Role of Feedback in Test-Time Scaling of Agentic AI Workflows [71.92083784393418]
Agentic AI systems, which autonomously plan and act, are becoming widespread, yet their task success rate on complex tasks remains low. Inference-time alignment relies on three components: sampling, evaluation, and feedback. We introduce Iterative Agent Decoding (IAD), a procedure that repeatedly inserts feedback extracted from different forms of critiques.
arXiv Detail & Related papers (2025-04-02T17:40:47Z) - DeepONet Augmented by Randomized Neural Networks for Efficient Operator Learning in PDEs [5.84093922354671]
We propose RaNN-DeepONets, a hybrid architecture designed to balance accuracy and efficiency. RaNN-DeepONets achieves comparable accuracy while reducing computational costs by orders of magnitude. These results highlight the potential of RaNN-DeepONets as an efficient alternative for operator learning in PDE-based systems.
arXiv Detail & Related papers (2025-03-01T03:05:29Z) - Toward Relative Positional Encoding in Spiking Transformers [52.62008099390541]
Spiking neural networks (SNNs) are bio-inspired networks that mimic how neurons in the brain communicate through discrete spikes. We introduce several strategies to approximate relative positional encoding (RPE) in spiking Transformers.
arXiv Detail & Related papers (2025-01-28T06:42:37Z) - Alpha-VI DeepONet: A prior-robust variational Bayesian approach for enhancing DeepONets with uncertainty quantification [0.0]
We introduce a novel deep operator network (DeepONet) framework that incorporates generalised variational inference (GVI).
By incorporating Bayesian neural networks as the building blocks for the branch and trunk networks, our framework endows DeepONet with uncertainty quantification.
We demonstrate that modifying the variational objective function yields superior results in terms of minimising the mean squared error.
arXiv Detail & Related papers (2024-08-01T16:22:03Z) - WiNet: Wavelet-based Incremental Learning for Efficient Medical Image Registration [68.25711405944239]
Deep image registration has demonstrated exceptional accuracy and fast inference.
Recent advances have adopted either multiple cascades or pyramid architectures to estimate dense deformation fields in a coarse-to-fine manner.
We introduce a model-driven WiNet that incrementally estimates scale-wise wavelet coefficients for the displacement/velocity field across various scales.
arXiv Detail & Related papers (2024-07-18T11:51:01Z) - SWAP: Sparse Entropic Wasserstein Regression for Robust Network Pruning [9.60349706518775]
This study addresses the challenge of inaccurate gradients in computing the empirical Fisher Information Matrix during neural network pruning.
We introduce SWAP, a formulation of Entropic Wasserstein regression (EWR) for pruning, capitalizing on the geometric properties of the optimal transport problem.
Our proposed method achieves a gain of 6% improvement in accuracy and 8% improvement in testing loss for MobileNetV1 with less than one-fourth of the network parameters remaining.
arXiv Detail & Related papers (2023-10-07T21:15:32Z) - Accelerated replica exchange stochastic gradient Langevin diffusion enhanced Bayesian DeepONet for solving noisy parametric PDEs [7.337247167823921]
We propose a training framework for replica-exchange Langevin diffusion that exploits the neural network architecture of DeepONets.
We show that the proposed framework's exploration and exploitation capabilities enable improved training convergence for DeepONets in noisy scenarios.
We also show that replica-exchange Langevin diffusion improves the DeepONet's mean prediction accuracy in noisy scenarios.
arXiv Detail & Related papers (2021-11-03T19:23:59Z) - Pairwise Supervised Hashing with Bernoulli Variational Auto-Encoder and Self-Control Gradient Estimator [62.26981903551382]
Variational auto-encoders (VAEs) with binary latent variables provide state-of-the-art performance in terms of precision for document retrieval.
We propose a pairwise loss function with discrete latent VAE to reward within-class similarity and between-class dissimilarity for supervised hashing.
This new semantic hashing framework achieves superior performance compared to the state of the art.
arXiv Detail & Related papers (2020-05-21T06:11:33Z) - Faster Depth-Adaptive Transformers [71.20237659479703]
Depth-adaptive neural networks can dynamically adjust depths according to the hardness of input words.
Previous works generally build a halting unit to decide whether the computation should continue or stop at each layer.
In this paper, we get rid of the halting unit and estimate the required depths in advance, which yields a faster depth-adaptive model.
arXiv Detail & Related papers (2020-04-27T15:08:10Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.