Towards Universal Solvers: Using PGD Attack in Active Learning to Increase Generalizability of Neural Operators as Knowledge Distillation from Numerical PDE Solvers
- URL: http://arxiv.org/abs/2510.18989v1
- Date: Tue, 21 Oct 2025 18:13:05 GMT
- Title: Towards Universal Solvers: Using PGD Attack in Active Learning to Increase Generalizability of Neural Operators as Knowledge Distillation from Numerical PDE Solvers
- Authors: Yifei Sun
- Abstract summary: PDE solvers require fine space-time discretizations and local linearizations, leading to high memory cost and slow runtimes. We propose an adversarial teacher-student distillation framework in which a differentiable numerical solver supervises a compact neural operator. Experiments on Burgers and Navier-Stokes systems demonstrate that adversarial distillation substantially improves OOD robustness while preserving the low parameter cost and fast inference of neural operators.
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Nonlinear PDE solvers require fine space-time discretizations and local linearizations, leading to high memory cost and slow runtimes. Neural operators such as FNOs and DeepONets offer fast single-shot inference by learning function-to-function mappings and truncating high-frequency components, but they suffer from poor out-of-distribution (OOD) generalization, often failing on inputs outside the training distribution. We propose an adversarial teacher-student distillation framework in which a differentiable numerical solver supervises a compact neural operator while a PGD-style active sampling loop searches for worst-case inputs under smoothness and energy constraints to expand the training set. Using differentiable spectral solvers enables gradient-based adversarial search and stabilizes sample mining. Experiments on Burgers and Navier-Stokes systems demonstrate that adversarial distillation substantially improves OOD robustness while preserving the low parameter cost and fast inference of neural operators.
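The adversarial sampling loop described above can be sketched in miniature. The toy below uses a one-step spectral heat-equation solver as a stand-in differentiable teacher and a mode-truncated copy of it as the "neural operator" student; PGD ascends the teacher-student discrepancy while projecting each iterate onto a smoothness (low-pass) and energy (RMS) constraint. All function names and parameter choices here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def make_operators(n=128, nu=1e-4, dt=0.01, keep=16):
    """Diagonal (Fourier-space) teacher/student for one heat-equation step.

    teacher: exact spectral step on all modes; student: the same step with
    high frequencies truncated, mimicking a band-limited neural operator.
    """
    k = np.fft.fftfreq(n, d=1.0 / n)
    teacher = np.exp(-nu * (2 * np.pi * k) ** 2 * dt)
    student = teacher * (np.abs(k) <= keep)
    return teacher, student

def apply_op(op, u):
    """Apply a diagonal Fourier-space operator to a real field."""
    return np.fft.ifft(op * np.fft.fft(u)).real

def pgd_worst_case(u0, teacher, student, steps=50, lr=0.5,
                   energy=1.0, smooth_keep=32):
    """Ascend the discrepancy ||(T - S) u||^2 by projected gradient ascent,
    keeping iterates smooth (low-pass) and bounded in energy (RMS)."""
    n = len(u0)
    k = np.fft.fftfreq(n, d=1.0 / n)
    lowpass = np.abs(k) <= smooth_keep
    diff2 = (teacher - student) ** 2       # operators are diagonal and real
    u = u0.copy()
    for _ in range(steps):
        # grad of ||(T - S) u||^2 w.r.t. u is 2 (T - S)^2 u in Fourier space
        grad = 2 * np.fft.ifft(diff2 * np.fft.fft(u)).real
        u = u + lr * grad                                # gradient ascent
        u = np.fft.ifft(lowpass * np.fft.fft(u)).real    # smoothness projection
        rms = np.linalg.norm(u) / np.sqrt(n)
        u = u * min(1.0, energy / rms)                   # energy projection
    return u
```

In the full framework, each mined worst-case input would be labeled by the teacher solver and appended to the training set before the next distillation epoch.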
Related papers
- Adaptive recurrent flow map operator learning for reaction diffusion dynamics [0.9137554315375919]
We develop an operator learner with adaptive recurrent training (DDOL-ART) using a robust recurrent strategy with lightweight validation milestones. DDOL-ART learns one-step operators that remain stable under long rollouts and generalize zero-shot to strong shifts. It is several-fold faster than a physics-based numerical-loss operator learner (NLOL) under matched settings.
arXiv Detail & Related papers (2026-02-10T07:33:13Z)
- NOWS: Neural Operator Warm Starts for Accelerating Iterative Solvers [1.8117099374299037]
Partial differential equations (PDEs) underpin quantitative descriptions across the physical sciences and engineering. Data-driven surrogates can be strikingly fast but are often unreliable when applied outside their training distribution. Here we introduce Neural Operator Warm Starts (NOWS), a hybrid strategy that harnesses learned solution operators to accelerate classical iterative solvers.
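The warm-start idea can be illustrated with plain conjugate gradients: a good initial guess shrinks the starting residual, so the iterative solver certifies the same tolerance in fewer iterations. The "surrogate" below is a hypothetical stand-in (the true solution plus small noise) for a neural operator's prediction; it is not NOWS itself.

```python
import numpy as np

def cg(A, b, x0, tol=1e-8, max_iter=5000):
    """Classical conjugate gradients; returns the solution and iteration count."""
    x = x0.astype(float).copy()
    r = b - A @ x
    p = r.copy()
    rs = r @ r
    bnorm = np.linalg.norm(b)
    for it in range(max_iter):
        if np.sqrt(rs) < tol * bnorm:   # converged relative to ||b||
            return x, it
        Ap = A @ p
        alpha = rs / (p @ Ap)
        x += alpha * p
        r -= alpha * Ap
        rs_new = r @ r
        p = r + (rs_new / rs) * p
        rs = rs_new
    return x, max_iter

# a well-conditioned SPD test system
rng = np.random.default_rng(0)
n = 200
B = rng.standard_normal((n, n))
A = np.eye(n) + B @ B.T / n
b = np.ones(n)
x_true = np.linalg.solve(A, b)

# hypothetical neural-operator prediction: accurate but not exact
x_surrogate = x_true + 1e-4 * rng.standard_normal(n)

x_cold, iters_cold = cg(A, b, np.zeros(n))      # cold start
x_warm, iters_warm = cg(A, b, x_surrogate)      # warm start
```

Because the surrogate already captures most of the solution, the warm-started run needs only to correct a small residual, which is where the hybrid scheme recovers classical-solver reliability at reduced cost.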
arXiv Detail & Related papers (2025-11-04T11:12:27Z)
- How deep is your network? Deep vs. shallow learning of transfer operators [0.4473327661758546]
We propose a randomized neural network approach called RaNNDy for learning transfer operators and their spectral decompositions from data. The main advantage is that, without a noticeable reduction in accuracy, this approach significantly reduces the training time and resources. We present results for different dynamical operators, including Koopman and Perron-Frobenius operators, which have important applications in analyzing the behavior of complex dynamical systems.
arXiv Detail & Related papers (2025-09-24T09:38:42Z)
- Accelerating PDE Solvers with Equation-Recast Neural Operator Preconditioning [9.178290601589365]
Minimal-Data Parametric Neural Operator Preconditioning (MD-PNOP) is a new paradigm for accelerating parametric PDE solvers. It recasts the residual from parameter deviation as an additional source term, so that trained neural operators can be used to refine the solution in an offline fashion. It consistently achieves a 50% reduction in computational time while maintaining full-order fidelity for fixed-source, single-group eigenvalue, and multigroup coupled eigenvalue problems.
arXiv Detail & Related papers (2025-09-01T12:14:58Z)
- Efficient Parametric SVD of Koopman Operator for Stochastic Dynamical Systems [51.54065545849027]
The Koopman operator provides a principled framework for analyzing nonlinear dynamical systems. VAMPnet and DPNet have been proposed to learn the leading singular subspaces of the Koopman operator. We propose a scalable and conceptually simple method for learning the top-$k$ singular functions of the Koopman operator.
arXiv Detail & Related papers (2025-07-09T18:55:48Z)
- Decentralized Nonconvex Composite Federated Learning with Gradient Tracking and Momentum [78.27945336558987]
Decentralized federated learning (DFL) eliminates reliance on a client-server architecture. Non-smooth regularization is often incorporated into machine learning tasks. We propose a novel DNCFL algorithm to solve these problems.
arXiv Detail & Related papers (2025-04-17T08:32:25Z)
- DeepONet Augmented by Randomized Neural Networks for Efficient Operator Learning in PDEs [5.84093922354671]
We propose RaNN-DeepONets, a hybrid architecture designed to balance accuracy and efficiency. RaNN-DeepONets achieves comparable accuracy while reducing computational costs by orders of magnitude. These results highlight the potential of RaNN-DeepONets as an efficient alternative for operator learning in PDE-based systems.
arXiv Detail & Related papers (2025-03-01T03:05:29Z)
- Spectral-Refiner: Accurate Fine-Tuning of Spatiotemporal Fourier Neural Operator for Turbulent Flows [6.961408873053586]
Recent advances in operator-type neural networks have shown promising results in approximating Partial Differential Equations (PDEs). These neural networks entail considerable training expenses and may not always achieve the desired accuracy required in many scientific and engineering disciplines.
arXiv Detail & Related papers (2024-05-27T14:33:06Z)
- Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems.
PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features.
In this paper, we propose to employ the implicit stochastic gradient descent (ISGD) method to train PINNs, improving the stability of the training process.
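The stability advantage of implicit updates can be seen on a one-dimensional quadratic, where the implicit step has a closed form. This is an illustrative toy under simplifying assumptions, not the paper's PINN training setup.

```python
def explicit_gd(a, x0, lr, steps):
    """Explicit step x <- x - lr * f'(x) for f(x) = a*x^2/2.

    The update multiplies x by (1 - lr*a), so it diverges when lr > 2/a.
    """
    x = x0
    for _ in range(steps):
        x = x - lr * a * x
    return x

def implicit_gd(a, x0, lr, steps):
    """Implicit step x_next = x - lr * f'(x_next).

    For f(x) = a*x^2/2 solving for x_next gives x / (1 + lr*a),
    which contracts toward the minimum for every positive lr.
    """
    x = x0
    for _ in range(steps):
        x = x / (1 + lr * a)
    return x
```

With a = 10 and lr = 0.5 (well above the explicit stability limit 2/a = 0.2), the explicit iterates blow up while the implicit iterates still converge, which is the kind of robustness to stiff, multi-scale targets the ISGD approach aims for.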
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
- Monte Carlo Neural PDE Solver for Learning PDEs via Probabilistic Representation [59.45669299295436]
We propose a Monte Carlo PDE solver for training unsupervised neural solvers. We use the PDEs' probabilistic representation, which regards macroscopic phenomena as ensembles of random particles. Our experiments on convection-diffusion, Allen-Cahn, and Navier-Stokes equations demonstrate significant improvements in accuracy and efficiency.
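A minimal example of such a probabilistic representation: for the heat equation u_t = ν u_xx with initial data g, the Feynman-Kac formula gives u(x, t) = E[g(x + sqrt(2νt) Z)] with Z standard normal, so the solution can be estimated by averaging over random particles. The function names below are illustrative, not from the paper.

```python
import numpy as np

def mc_heat(g, x, t, nu, n_samples=200_000, seed=0):
    """Monte Carlo estimate of the heat-equation solution via Feynman-Kac:
    u(x, t) = E[g(x + sqrt(2*nu*t) * Z)], Z ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n_samples)
    return g(x + np.sqrt(2.0 * nu * t) * z).mean()
```

With g = sin the exact solution is exp(-νt) sin(x), so the estimate can be checked directly; a neural solver trained on such particle ensembles needs no mesh or labeled solution data.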
arXiv Detail & Related papers (2023-02-10T08:05:19Z)
- Incorporating NODE with Pre-trained Neural Differential Operator for Learning Dynamics [73.77459272878025]
We propose to enhance the supervised signal in learning dynamics by pre-training a neural differential operator (NDO).
The NDO is pre-trained on a class of symbolic functions, and it learns the mapping from the trajectory samples of these functions to their derivatives.
We provide a theoretical guarantee that the output of the NDO can closely approximate the ground-truth derivatives by properly tuning the complexity of the library.
arXiv Detail & Related papers (2021-06-08T08:04:47Z)
- Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications.
Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients.
We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
arXiv Detail & Related papers (2021-04-27T16:56:09Z)
- On Learning Rates and Schrödinger Operators [105.32118775014015]
We present a general theoretical analysis of the effect of the learning rate.
We find that the learning rate tends to zero for a broad class of non-neural functions.
arXiv Detail & Related papers (2020-04-15T09:52:37Z)
This list is automatically generated from the titles and abstracts of the papers in this site.