Related papers: Deep Learning-Enhanced Preconditioning for Efficient Conjugate Gradient Solvers in Large-Scale PDE Systems

Related papers

Multi-Dimensional Visual Data Recovery: Scale-Aware Tensor Modeling and Accelerated Randomized Computation [51.65236537605077]
We propose a new type of network compression optimization technique, fully randomized tensor network compression (FCTN)<n>FCTN has significant advantages in correlation characterization and transpositional in algebra, and has notable achievements in multi-dimensional data processing and analysis.<n>We derive efficient algorithms with guarantees to solve the formulated models.
arXiv Detail & Related papers (2026-02-13T14:56:37Z)
Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration [67.12978375116599]
We show that the steps of gradient descent (GD) reduce to those of generalized perceptron algorithms.<n>This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks.
arXiv Detail & Related papers (2025-12-12T14:16:35Z)
Tensor Gaussian Processes: Efficient Solvers for Nonlinear PDEs [16.38975732337055]
TGPS is a machine learning solver for partial differential equations.<n>It reduces the task to learning a collection of one-dimensional GPs.<n>It achieves superior accuracy and efficiency compared to existing approaches.
arXiv Detail & Related papers (2025-10-15T17:23:21Z)
Physics-informed neural networks for high-dimensional solutions and snaking bifurcations in nonlinear lattices [0.0]
This paper introduces a framework based on physics-informed neural networks (PINNs) for addressing key challenges in nonlinear lattices.<n>We first employ PINNs to approximate solutions of nonlinear systems arising from lattice models, using the Levenberg-Marquardt algorithm.<n>We then extend the method by coupling PINNs with a continuation approach to compute snaking bifurcation diagrams.<n>For linear stability analysis, we adapt PINNs to compute eigenvectors, introducing output constraints to enforce positivity, in line with Sturm-Liouville theory.
arXiv Detail & Related papers (2025-07-13T20:41:55Z)
Scalable Gaussian Processes: Advances in Iterative Methods and Pathwise Conditioning [0.26107298043931204]
This dissertation focuses on the powerful combination of iterative methods and pathwise conditioning to develop methodological contributions.<n>By combining these two techniques, expensive computations are expressed as solutions to systems of linear equations and obtained by leveraging iterative linear system solvers.<n>This drastically reduces memory requirements, facilitating application to significantly larger amounts of data, and introduces matrix multiplication as the main computational operation.
arXiv Detail & Related papers (2025-07-09T13:39:37Z)
Accelerating Natural Gradient Descent for PINNs with Randomized Numerical Linear Algebra [0.0]
Natural Gradient Descent (NGD) has emerged as a promising optimization algorithm for training neural network-based solvers for partial differential equations (PDEs)<n>We extend matrix-free NGD to broader classes of problems than previously considered and propose the use of Randomized Nystr"om preconditioning to accelerate convergence of the inner CG solver.<n>The resulting algorithm demonstrates substantial performance improvements over existing NGD-based methods on a range of PDE problems discretized using neural networks.
arXiv Detail & Related papers (2025-05-16T19:00:40Z)
Outlier-aware Tensor Robust Principal Component Analysis with Self-guided Data Augmentation [21.981038455329013]
We propose a self-guided data augmentation approach that employs adaptive weighting to suppress outlier influence. We show the improvements in both accuracy and computational efficiency compared to state-of-the-art methods.
arXiv Detail & Related papers (2025-04-25T13:03:35Z)
Linearly Convergent Mixup Learning [0.0]
We present two novel algorithms that extend to a broader range of binary classification models. Unlike gradient-based approaches, our algorithms do not require hyper parameters like learning rates, simplifying their implementation and optimization. Our algorithms achieve faster convergence to the optimal solution compared to descent gradient approaches, and that mixup data augmentation consistently improves the predictive performance across various loss functions.
arXiv Detail & Related papers (2025-01-14T02:33:40Z)
TensorGRaD: Tensor Gradient Robust Decomposition for Memory-Efficient Neural Operator Training [91.8932638236073]
We introduce textbfTensorGRaD, a novel method that directly addresses the memory challenges associated with large-structured weights.<n>We show that sparseGRaD reduces total memory usage by over $50%$ while maintaining and sometimes even improving accuracy.
arXiv Detail & Related papers (2025-01-04T20:51:51Z)
Adaptive Error-Bounded Hierarchical Matrices for Efficient Neural Network Compression [0.0]
This paper introduces a dynamic, error-bounded hierarchical matrix (H-matrix) compression method tailored for Physics-Informed Neural Networks (PINNs) The proposed approach reduces the computational complexity and memory demands of large-scale physics-based models while preserving the essential properties of the Neural Tangent Kernel (NTK) Empirical results demonstrate that this technique outperforms traditional compression methods, such as Singular Value Decomposition (SVD), pruning, and quantization, by maintaining high accuracy and improving generalization capabilities.
arXiv Detail & Related papers (2024-09-11T05:55:51Z)
Adaptive Federated Learning Over the Air [108.62635460744109]
We propose a federated version of adaptive gradient methods, particularly AdaGrad and Adam, within the framework of over-the-air model training. Our analysis shows that the AdaGrad-based training algorithm converges to a stationary point at the rate of $mathcalO( ln(T) / T 1 - frac1alpha ).
arXiv Detail & Related papers (2024-03-11T09:10:37Z)
An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem. A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network. The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
Neural incomplete factorization: learning preconditioners for the conjugate gradient method [2.899792823251184]
We develop a data-driven approach to accelerate the generation of effective preconditioners. We replace the typically hand-engineered preconditioners by the output of graph neural networks. Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF)
arXiv Detail & Related papers (2023-05-25T11:45:46Z)
Implicit Stochastic Gradient Descent for Training Physics-informed Neural Networks [51.92362217307946]
Physics-informed neural networks (PINNs) have effectively been demonstrated in solving forward and inverse differential equation problems. PINNs are trapped in training failures when the target functions to be approximated exhibit high-frequency or multi-scale features. In this paper, we propose to employ implicit gradient descent (ISGD) method to train PINNs for improving the stability of training process.
arXiv Detail & Related papers (2023-03-03T08:17:47Z)
On Robust Numerical Solver for ODE via Self-Attention Mechanism [82.95493796476767]
We explore training efficient and robust AI-enhanced numerical solvers with a small data size by mitigating intrinsic noise disturbances. We first analyze the ability of the self-attention mechanism to regulate noise in supervised learning and then propose a simple-yet-effective numerical solver, Attr, which introduces an additive self-attention mechanism to the numerical solution of differential equations.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
Learning Subgrid-scale Models with Neural Ordinary Differential Equations [0.39160947065896795]
We propose a new approach to learning the subgrid-scale model when simulating partial differential equations (PDEs) In this approach neural networks are used to learn the coarse- to fine-grid map, which can be viewed as subgrid-scale parameterization. Our method inherits the advantages of NODEs and can be used to parameterize subgrid scales, approximate coupling operators, and improve the efficiency of low-order solvers.
arXiv Detail & Related papers (2022-12-20T02:45:09Z)
An Accelerated Doubly Stochastic Gradient Method with Faster Explicit Model Identification [97.28167655721766]
We propose a novel doubly accelerated gradient descent (ADSGD) method for sparsity regularized loss minimization problems. We first prove that ADSGD can achieve a linear convergence rate and lower overall computational complexity.
arXiv Detail & Related papers (2022-08-11T22:27:22Z)
Cogradient Descent for Dependable Learning [64.02052988844301]
We propose a dependable learning based on Cogradient Descent (CoGD) algorithm to address the bilinear optimization problem. CoGD is introduced to solve bilinear problems when one variable is with sparsity constraint. It can also be used to decompose the association of features and weights, which further generalizes our method to better train convolutional neural networks (CNNs)
arXiv Detail & Related papers (2021-06-20T04:28:20Z)
Partitioning sparse deep neural networks for scalable training and inference [8.282177703075453]
State-of-the-art deep neural networks (DNNs) have significant computational and data management requirements. Sparsification and pruning methods are shown to be effective in removing a large fraction of connections in DNNs. The resulting sparse networks present unique challenges to further improve the computational efficiency of training and inference in deep learning.
arXiv Detail & Related papers (2021-04-23T20:05:52Z)
Multipole Graph Neural Operator for Parametric Partial Differential Equations [57.90284928158383]
One of the main challenges in using deep learning-based methods for simulating physical systems is formulating physics-based data. We propose a novel multi-level graph neural network framework that captures interaction at all ranges with only linear complexity. Experiments confirm our multi-graph network learns discretization-invariant solution operators to PDEs and can be evaluated in linear time.
arXiv Detail & Related papers (2020-06-16T21:56:22Z)

This list is automatically generated from the titles and abstracts of the papers in this site.