Recurrent Neural Operators: Stable Long-Term PDE Prediction
- URL: http://arxiv.org/abs/2505.20721v1
- Date: Tue, 27 May 2025 05:04:35 GMT
- Title: Recurrent Neural Operators: Stable Long-Term PDE Prediction
- Authors: Zaijun Ye, Chen-Song Zhang, Wansheng Wang
- Abstract summary: We propose Recurrent Neural Operators (RNOs) to integrate recurrent training into neural operator architectures. RNOs apply the operator to their own predictions over a temporal window, effectively simulating inference-time dynamics during training. We show that recurrent training can reduce the worst-case exponential error growth typical of teacher forcing to linear growth.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Neural operators have emerged as powerful tools for learning solution operators of partial differential equations. However, in time-dependent problems, standard training strategies such as teacher forcing introduce a mismatch between training and inference, leading to compounding errors in long-term autoregressive predictions. To address this issue, we propose Recurrent Neural Operators (RNOs), a novel framework that integrates recurrent training into neural operator architectures. Instead of conditioning each training step on ground-truth inputs, RNOs recursively apply the operator to their own predictions over a temporal window, effectively simulating inference-time dynamics during training. This alignment mitigates exposure bias and enhances robustness to error accumulation. Theoretically, we show that recurrent training can reduce the worst-case exponential error growth typical of teacher forcing to linear growth. Empirically, we demonstrate that recurrently trained Multigrid Neural Operators significantly outperform their teacher-forced counterparts in long-term accuracy and stability on standard benchmarks. Our results underscore the importance of aligning training with inference dynamics for robust temporal generalization in neural operator learning.
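As a concrete illustration of the two training regimes the abstract contrasts, here is a minimal PyTorch sketch; the operator `op`, the trajectory layout, and the window length are illustrative assumptions, not the authors' released code:

```python
import torch

def teacher_forcing_loss(op, traj):
    # traj: (T, batch, ...) ground-truth snapshots u_0, ..., u_{T-1}.
    # Each step is conditioned on the ground truth, so train-time inputs
    # never contain the model's own errors (exposure bias).
    loss = 0.0
    for t in range(traj.shape[0] - 1):
        pred = op(traj[t])
        loss = loss + torch.mean((pred - traj[t + 1]) ** 2)
    return loss

def recurrent_loss(op, traj, window):
    # Recurrent (rollout) training: apply the operator to its own
    # predictions over a temporal window, mimicking inference.
    u = traj[0]
    loss = 0.0
    for t in range(1, window + 1):
        u = op(u)                     # feed back the prediction
        loss = loss + torch.mean((u - traj[t]) ** 2)
    return loss
```

The theoretical result quoted above says that exposing the operator to its own predictions in this way can bring worst-case rollout error growth down from exponential in the number of steps, typical of teacher forcing, to linear.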
Related papers
- Learning Physical Operators using Neural Operators [10.57578521926415]
We train neural operators to learn individual non-linear physical operators while approximating linear operators with fixed finite-difference convolutions. We formulate the modelling task as a neural ordinary differential equation (ODE) where these learned operators constitute the right-hand side (a sketch follows this entry). Our approach achieves better convergence and superior performance when generalising to unseen physics.
arXiv Detail & Related papers (2026-02-26T15:27:14Z)
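A minimal sketch of one plausible reading of this construction, with illustrative names and an arbitrary diffusivity: a frozen finite-difference convolution plays the fixed linear operator, a small network plays the learned nonlinear operator, and their sum is integrated as the right-hand side of an ODE with forward Euler.

```python
import torch
import torch.nn as nn

class LearnedRHS(nn.Module):
    def __init__(self, dx):
        super().__init__()
        # Fixed second-order finite-difference Laplacian (not trained).
        lap = torch.tensor([[[1.0, -2.0, 1.0]]]) / dx**2
        self.register_buffer("laplacian", lap)
        # Learned pointwise nonlinear operator N_theta(u).
        self.nonlinear = nn.Sequential(
            nn.Conv1d(1, 32, 1), nn.GELU(), nn.Conv1d(32, 1, 1))

    def forward(self, u):
        # du/dt = nu * Lap(u) + N_theta(u); nu = 0.1 is an arbitrary choice.
        lin = nn.functional.conv1d(u, self.laplacian, padding=1)
        return 0.1 * lin + self.nonlinear(u)

def euler_rollout(rhs, u0, dt, steps):
    # Integrate the neural ODE with forward Euler.
    u, out = u0, [u0]
    for _ in range(steps):
        u = u + dt * rhs(u)
        out.append(u)
    return torch.stack(out)
```

- When Learning Hurts: Fixed-Pole RNN for Real-Time Online Training [58.25341036646294]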
We analytically and empirically examine why learning recurrent poles does not provide tangible benefits in real-time online learning scenarios. We show that fixed-pole networks achieve superior performance with lower training complexity, making them more suitable for online real-time tasks.
arXiv Detail & Related papers (2026-02-25T00:15:13Z)
- Adaptive recurrent flow map operator learning for reaction diffusion dynamics [0.9137554315375919]
We develop an operator learner with adaptive recurrent training (DDOL-ART) using a robust recurrent strategy with lightweight validation milestones. DDOL-ART learns one-step operators that remain stable under long rollouts and generalize zero-shot to strong shifts. It is several-fold faster than a physics-based numerical-loss operator learner (NLOL) under matched settings.
arXiv Detail & Related papers (2026-02-10T07:33:13Z)
- MesaNet: Sequence Modeling by Locally Optimal Test-Time Training [67.45211108321203]
We introduce a numerically stable, chunkwise parallelizable version of the recently proposed Mesa layer. We show that optimal test-time training enables reaching lower language modeling perplexity and higher downstream benchmark performance than previous RNNs.
arXiv Detail & Related papers (2025-06-05T16:50:23Z)
- Temporal Neural Operator for Modeling Time-Dependent Physical Phenomena [0.0]
Neural operators (NOs) are machine learning models designed to solve partial differential equations (PDEs) by learning to map between function spaces. They struggle to map the temporal dynamics of time-dependent PDEs, especially for time steps not explicitly seen during training. Most NOs tend to be prohibitively costly to train, especially for higher-dimensional PDEs. We propose the Temporal Neural Operator (TNO), an efficient neural operator designed for time-dependent PDEs.
arXiv Detail & Related papers (2025-04-28T20:40:19Z)
- Neural Operator Learning for Long-Time Integration in Dynamical Systems with Recurrent Neural Networks [1.6874375111244329]
Deep neural networks offer reduced computational costs during inference and can be trained directly from observational data.
Existing methods, however, cannot extrapolate accurately and are prone to error accumulation in long-time integration.
We address this issue by combining neural operators with recurrent neural networks: the operator learns the solution mapping, while the recurrent structure captures temporal dependencies (a sketch follows this entry).
arXiv Detail & Related papers (2023-03-03T22:19:23Z)
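One way such a combination might look, as a hedged sketch; the layer sizes, the linear encoder standing in for a full operator block, and all names are assumptions rather than the paper's architecture.

```python
import torch
import torch.nn as nn

class OperatorRNN(nn.Module):
    def __init__(self, n_grid=64, hidden=128):
        super().__init__()
        self.hidden = hidden
        self.encode = nn.Linear(n_grid, hidden)  # stand-in for an operator layer
        self.cell = nn.GRUCell(hidden, hidden)   # recurrent temporal structure
        self.decode = nn.Linear(hidden, n_grid)

    def forward(self, u0, steps):
        # Roll the dynamics forward; the hidden state carries temporal
        # dependencies across steps.
        h = u0.new_zeros(u0.shape[0], self.hidden)
        u, preds = u0, []
        for _ in range(steps):
            h = self.cell(self.encode(u), h)
            u = self.decode(h)
            preds.append(u)
        return torch.stack(preds, dim=1)         # (batch, steps, n_grid)
```

- Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]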
We propose a Monte Carlo PDE solver for training unsupervised neural solvers. We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles (a small sketch of this representation follows the entry). Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
arXiv Detail & Related papers (2023-02-10T08:05:19Z)
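To make the "ensembles of random particles" idea concrete, here is a tiny sketch of the probabilistic (Feynman-Kac) representation for the 1D heat equation, whose Monte Carlo estimates could serve as training targets for an unsupervised neural solver; the function names and parameters are illustrative.

```python
import torch

def mc_heat_solution(g, x, t, nu=0.1, n_particles=10_000):
    # Probabilistic representation of u_t = nu * u_xx:
    #   u(x, t) = E[ g(x + sqrt(2 * nu * t) * Z) ],  Z ~ N(0, 1).
    # Each sample is one "random particle" diffusing from x.
    z = torch.randn(n_particles, 1)
    samples = g(x.unsqueeze(0) + (2 * nu * t) ** 0.5 * z)  # (particles, len(x))
    return samples.mean(dim=0)

# Example: Gaussian bump initial condition on [-1, 1].
x = torch.linspace(-1.0, 1.0, 101)
u = mc_heat_solution(lambda y: torch.exp(-25 * y**2), x, t=0.05)
```

- Modeling Nonlinear Dynamics in Continuous Time with Inductive Biases on Decay Rates and/or Frequencies [37.795752939016225]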
We propose a neural network-based model for nonlinear dynamics in continuous time that can impose inductive biases on decay rates and frequencies.
We use neural networks to find an appropriate Koopman space; the networks are trained by minimizing multi-step forecasting and backcasting errors on irregularly sampled time-series data (a sketch of such constrained latent dynamics follows this entry).
arXiv Detail & Related papers (2022-12-26T08:08:43Z)
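A hedged sketch of what linear latent dynamics with explicit, constrainable decay rates and frequencies can look like; the 2x2 decay-rotation block parameterization below is one standard choice, not necessarily the paper's.

```python
import torch
import torch.nn as nn

class DampedOscillatorLatent(nn.Module):
    # Koopman-style latent dynamics built from 2x2 decay-rotation blocks,
    # so decay rates and frequencies are explicit parameters on which
    # inductive biases (here: nonnegative decay) can be imposed.
    def __init__(self, n_modes=8):
        super().__init__()
        self.raw_decay = nn.Parameter(torch.zeros(n_modes))
        self.freq = nn.Parameter(torch.randn(n_modes))

    def step(self, z, dt):
        # z: (batch, n_modes, 2); eigenvalues -d + i*w with d >= 0 enforced.
        d = nn.functional.softplus(self.raw_decay)   # bias: only stable modes
        amp = torch.exp(-d * dt)
        c, s = torch.cos(self.freq * dt), torch.sin(self.freq * dt)
        x = amp * (c * z[..., 0] - s * z[..., 1])
        y = amp * (s * z[..., 0] + c * z[..., 1])
        return torch.stack([x, y], dim=-1)
```

- Addressing Mistake Severity in Neural Networks with Semantic Knowledge [0.0]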
Most robust training techniques aim to improve model accuracy on perturbed inputs.
As an alternate form of robustness, we aim to reduce the severity of mistakes made by neural networks in challenging conditions.
We leverage current adversarial training methods to generate targeted adversarial attacks during the training process (a sketch of such an attack follows this entry).
Results demonstrate that our approach performs better with respect to mistake severity compared to standard and adversarially trained models.
arXiv Detail & Related papers (2022-11-21T22:01:36Z)
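A minimal sketch of a targeted attack of the kind such a procedure could use during training; `target_class` would hold semantically similar (rather than true) labels, and all names are illustrative.

```python
import torch
import torch.nn.functional as F

def targeted_fgsm(model, x, target_class, eps=0.03):
    # Targeted FGSM: step *down* the loss gradient for the chosen class,
    # pushing the perturbed input toward `target_class`. Training on such
    # examples is one way to penalize semantically severe mistakes.
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), target_class)
    loss.backward()
    return (x - eps * x.grad.sign()).detach()
```

- Intelligence Processing Units Accelerate Neuromorphic Learning [52.952192990802345]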
Spiking neural networks (SNNs) have achieved orders of magnitude improvement in terms of energy consumption and latency.
We present an IPU-optimized release of our custom SNN Python package, snnTorch.
arXiv Detail & Related papers (2022-11-19T15:44:08Z)
- Stabilizing Machine Learning Prediction of Dynamics: Noise and Noise-inspired Regularization [58.720142291102135]
Recent work has shown that machine learning (ML) models can be trained to accurately forecast the dynamics of chaotic dynamical systems.
In the absence of mitigating techniques, this approach can result in artificially rapid error growth, leading to inaccurate predictions and/or climate instability.
We introduce Linearized Multi-Noise Training (LMNT), a regularization technique that deterministically approximates the effect of many small, independent noise realizations added to the model input during training (a sketch of the underlying idea follows this entry).
arXiv Detail & Related papers (2022-11-09T23:40:52Z)
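For intuition, a small sketch of the general principle such noise-inspired regularization builds on: training with small input noise is, to second order, equivalent to a Jacobian penalty (Bishop, 1995). The JVP-based estimator below illustrates that surrogate; it is not the paper's exact LMNT construction.

```python
import torch

def noisy_input_loss(model, u, target, sigma):
    # Stochastic version: one random noise realization per training step.
    return torch.mean((model(u + sigma * torch.randn_like(u)) - target) ** 2)

def linearized_noise_penalty(model, u, sigma, n_probes=4):
    # Surrogate for the average effect of many noise realizations: small
    # input noise adds a penalty ~ sigma^2 * ||Jacobian||_F^2 to second
    # order. The squared Frobenius norm is estimated here with
    # Jacobian-vector products along random probe directions.
    penalty = 0.0
    for _ in range(n_probes):
        v = torch.randn_like(u)
        _, jv = torch.func.jvp(model, (u,), (v,))
        penalty = penalty + torch.mean(jv ** 2)
    return sigma ** 2 * penalty / n_probes
```

- What training reveals about neural network complexity [80.87515604428346]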
This work explores the hypothesis that the complexity of the function a deep neural network (NN) is learning can be deduced by how fast its weights change during training.
Our results support the hypothesis that good training behavior can be a useful bias towards good generalization.
arXiv Detail & Related papers (2021-06-08T08:58:00Z)
- Fast Training of Deep Neural Networks Robust to Adversarial Perturbations [0.0]
We show that a fast approximation to adversarial training holds promise for reducing training time while maintaining robustness.
Fast adversarial training is a promising approach that will provide increased security and explainability in machine learning applications.
arXiv Detail & Related papers (2020-07-08T00:35:39Z)
- Age-Based Coded Computation for Bias Reduction in Distributed Learning [57.9123881133818]
Coded computation can be used to speed up distributed learning in the presence of straggling workers.
Partial recovery of the gradient vector can further reduce the computation time at each iteration.
Estimator bias will be particularly prevalent when the straggling behavior is correlated over time.
arXiv Detail & Related papers (2020-06-02T17:51:11Z)
- Understanding and Mitigating the Tradeoff Between Robustness and Accuracy [88.51943635427709]
Adversarial training augments the training set with perturbations to improve the robust error.
We show that the standard error could increase even when the augmented perturbations have noiseless observations from the optimal linear predictor.
arXiv Detail & Related papers (2020-02-25T08:03:01Z)