Related papers: On Robust Numerical Solver for ODE via Self-Attention Mechanism

On Robust Numerical Solver for ODE via Self-Attention Mechanism

URL: http://arxiv.org/abs/2302.10184v1
Date: Sun, 5 Feb 2023 01:39:21 GMT
Title: On Robust Numerical Solver for ODE via Self-Attention Mechanism
Authors: Zhongzhan Huang, Mingfu Liang and Liang Lin
Abstract summary: We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances. We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
Score: 82.95493796476767
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: With the development of deep learning techniques, AI-enhanced numerical solvers are expected to become a new paradigm for solving differential equations due to their versatility and effectiveness in alleviating the accuracy-speed trade-off in traditional numerical solvers. However, this paradigm still inevitably requires a large amount of high-quality data, whose acquisition is often very expensive in natural science and engineering problems. Therefore, in this paper, we explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances. We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, AttSolver, which introduces an additive self-attention mechanism to the numerical solution of differential equations based on the dynamical system perspective of the residual neural network. Our results on benchmarks, ranging from high-dimensional problems to chaotic systems, demonstrate the effectiveness of AttSolver in generally improving the performance of existing traditional numerical solvers without any elaborated model crafting. Finally, we analyze the convergence, generalization, and robustness of the proposed method experimentally and theoretically.

Related papers

A Simultaneous Approach for Training Neural Differential-Algebraic Systems of Equations [0.4935512063616847]
We study neural differential-algebraic systems of equations (DAEs), where some unknown relationships are learned from data. We apply the simultaneous approach to neural DAE problems, resulting in a fully discretized nonlinear optimization problem. We achieve promising results in terms of accuracy, model generalizability and computational cost, across different problem settings.
arXiv Detail & Related papers (2025-04-07T01:26:55Z)
Advanced Physics-Informed Neural Network with Residuals for Solving Complex Integral Equations [0.13499500088995461]
RISN is a novel neural network architecture designed to solve a wide range of integral and integro-differential equations. RISN integrates residual connections with high-accuracy numerical methods. We demonstrate that RISN consistently outperforms classical PINNs.
arXiv Detail & Related papers (2025-01-22T19:47:03Z)
A Model-Constrained Discontinuous Galerkin Network (DGNet) for Compressible Euler Equations with Out-of-Distribution Generalization [0.0]
We develop a model-constrained discontinuous Galerkin Network (DGNet) approach. The core of DGNet is the synergy of several key strategies. We present comprehensive numerical results for 1D and 2D compressible Euler equation problems.
arXiv Detail & Related papers (2024-09-27T01:13:38Z)
Harnessing physics-informed operators for high-dimensional reliability analysis problems [0.8192907805418583]
Reliability analysis is a formidable task, particularly in systems with a large number of parameters. Conventional methods for quantifying reliability often rely on extensive simulations or experimental data. We show that physics-informed operator can seamlessly solve high-dimensional reliability analysis problems with reasonable accuracy.
arXiv Detail & Related papers (2024-09-07T04:52:03Z)
Adversarial Learning for Neural PDE Solvers with Sparse Data [4.226449585713182]
This study introduces a universal learning strategy for neural network PDEs, named Systematic Model Augmentation for Robust Training. By focusing on challenging and improving the model's weaknesses, SMART reduces generalization error during training under data-scarce conditions.
arXiv Detail & Related papers (2024-09-04T04:18:25Z)
Physics Informed Kolmogorov-Arnold Neural Networks for Dynamical Analysis via Efficent-KAN and WAV-KAN [0.12045539806824918]
We implement the Physics-Informed Kolmogorov-Arnold Neural Networks (PIKAN) through efficient-KAN and WAV-KAN. PIKAN demonstrates superior performance compared to conventional deep neural networks, achieving the same level of accuracy with fewer layers and reduced computational overhead.
arXiv Detail & Related papers (2024-07-25T20:14:58Z)
Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences. It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations. Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
Machine Learning Insides OptVerse AI Solver: Design Principles and Applications [74.67495900436728]
We present a comprehensive study on the integration of machine learning (ML) techniques into Huawei Cloud's OptVerse AI solver. We showcase our methods for generating complex SAT and MILP instances utilizing generative models that mirror multifaceted structures of real-world problem. We detail the incorporation of state-of-the-art parameter tuning algorithms which markedly elevate solver performance.
arXiv Detail & Related papers (2024-01-11T15:02:15Z)
Physical Information Neural Networks for Solving High-index Differential-algebraic Equation Systems Based on Radau Methods [10.974537885042613]
We propose a PINN computational framework, combined Radau IIA numerical method with a neural network structure via the attention mechanisms, to directly solve high-index DAEs. Our method exhibits excellent computational accuracy and strong generalization capabilities, providing a feasible approach for the high-precision solution of larger-scale DAEs.
arXiv Detail & Related papers (2023-10-19T15:57:10Z)
An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem. A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network. The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems. PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features. In this paper, we propose to employ implicit gradient descent (ISGD) method to train PINNs for improving the stability of training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
Tunable Complexity Benchmarks for Evaluating Physics-Informed Neural Networks on Coupled Ordinary Differential Equations [64.78260098263489]
In this work, we assess the ability of physics-informed neural networks (PINNs) to solve increasingly-complex coupled ordinary differential equations (ODEs) We show that PINNs eventually fail to produce correct solutions to these benchmarks as their complexity increases. We identify several reasons why this may be the case, including insufficient network capacity, poor conditioning of the ODEs, and high local curvature, as measured by the Laplacian of the PINN loss.
arXiv Detail & Related papers (2022-10-14T15:01:32Z)
On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver [59.13397937903832]
We introduce a deep learning-based corrector called Neural Vector (NeurVec) NeurVec can compensate for integration errors and enable larger time step sizes in simulations. Our experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability.
arXiv Detail & Related papers (2022-08-07T09:02:18Z)
Learning Sparse Nonlinear Dynamics via Mixed-Integer Optimization [3.7565501074323224]
We propose an exact formulation of the SINDyDy problem using mixed-integer optimization (MIO) to solve the sparsity constrained regression problem to provable optimality in seconds. We illustrate the dramatic improvement of our approach in accurate model discovery while being more sample efficient, robust to noise, and flexible in accommodating physical constraints.
arXiv Detail & Related papers (2022-06-01T01:43:45Z)
Interfacing Finite Elements with Deep Neural Operators for Fast Multiscale Modeling of Mechanics Problems [4.280301926296439]
In this work, we explore the idea of multiscale modeling with machine learning and employ DeepONet, a neural operator, as an efficient surrogate of the expensive solver. DeepONet is trained offline using data acquired from the fine solver for learning the underlying and possibly unknown fine-scale dynamics. We present various benchmarks to assess accuracy and speedup, and in particular we develop a coupling algorithm for a time-dependent problem.
arXiv Detail & Related papers (2022-02-25T20:46:08Z)
Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features. We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features. Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound. Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
Efficient Model-Based Multi-Agent Mean-Field Reinforcement Learning [89.31889875864599]
We propose an efficient model-based reinforcement learning algorithm for learning in multi-agent systems. Our main theoretical contributions are the first general regret bounds for model-based reinforcement learning for MFC. We provide a practical parametrization of the core optimization problem.
arXiv Detail & Related papers (2021-07-08T18:01:02Z)
Fast Distributionally Robust Learning with Variance Reduced Min-Max Optimization [85.84019017587477]
Distributionally robust supervised learning is emerging as a key paradigm for building reliable machine learning systems for real-world applications. Existing algorithms for solving Wasserstein DRSL involve solving complex subproblems or fail to make use of gradients. We revisit Wasserstein DRSL through the lens of min-max optimization and derive scalable and efficiently implementable extra-gradient algorithms.
arXiv Detail & Related papers (2021-04-27T16:56:09Z)

This list is automatically generated from the titles and abstracts of the papers in this site.