Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers
- URL: http://arxiv.org/abs/2405.15557v3
- Date: Mon, 03 Feb 2025 13:54:57 GMT
- Title: Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers
- Authors: Vladislav Trifonov, Alexander Rudikov, Oleg Iliev, Yuri M. Laevsky, Ivan Oseledets, Ekaterina Muravleva
- Abstract summary: We train GNNs to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners.
Our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations.
- Score: 40.6591136324878
- Abstract: Large linear systems are ubiquitous in modern computational science and engineering. The main recipe for solving them is the use of Krylov subspace iterative methods with well-designed preconditioners. Recently, GNNs have been shown to be a promising tool for designing preconditioners to reduce the overall computational cost of iterative methods by constructing them more efficiently than with classical linear algebra techniques. Preconditioners designed with these approaches cannot outperform those designed with classical methods in terms of the number of iterations in CG. In our work, we recall well-established preconditioners from linear algebra and use them as a starting point for training the GNN to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners. Numerical experiments show that our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations. We also provide a heuristic justification for the loss function used and show that preconditioners obtained by learning with this loss function reduce the condition number in a more desirable way for CG.
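To make the baseline concrete, here is a minimal sketch (ours, not the authors' code) of the classical pipeline the paper starts from: preconditioned CG on a 2-D Poisson system, with SciPy's incomplete LU standing in for the "well-established preconditioners from linear algebra". The paper's method would replace such a hand-designed factorization with a GNN output trained to reduce the condition number further; that training loop is omitted here.

```python
# Minimal sketch: CG iteration counts with and without a classical
# incomplete-factorization preconditioner (the paper's starting point).
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# 2-D Poisson matrix, a simple stand-in for the parametric PDE systems.
n = 32
I = sp.identity(n)
T = sp.diags([-1, 2, -1], [-1, 0, 1], shape=(n, n))
A = (sp.kron(I, T) + sp.kron(T, I)).tocsc()
b = np.ones(A.shape[0])

def cg_iters(A, b, M=None):
    """Run SciPy's CG and count iterations via the callback."""
    it = 0
    def cb(xk):
        nonlocal it
        it += 1
    spla.cg(A, b, M=M, callback=cb)
    return it

# Classical baseline: incomplete LU (SciPy ships no incomplete Cholesky).
ilu = spla.spilu(A, drop_tol=1e-3)
M = spla.LinearOperator(A.shape, matvec=ilu.solve)

print("CG iterations, no preconditioner: ", cg_iters(A, b))
print("CG iterations, ILU preconditioner:", cg_iters(A, b, M=M))
```

The preconditioned run should need far fewer iterations, since M approximates the inverse of A and lowers the condition number of the preconditioned system, which is exactly the quantity the paper's GNN is trained to reduce.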
Related papers
- On The Concurrence of Layer-wise Preconditioning Methods and Provable Feature Learning [22.486361028522374]
We show that layer-wise preconditioning methods are provably necessary from a statistical perspective.
We show that SGD is a suboptimal feature learner when extending beyond ideal isotropic inputs.
We show that standard tools like Adam preconditioning and batch-norm only mildly mitigate these issues.
arXiv Detail & Related papers (2025-02-03T19:08:32Z)
- Learning incomplete factorization preconditioners for GMRES [1.1519724914285523]
We train a graph neural network to approximate the matrix factorization directly.
Applying a graph neural network architecture allows us to ensure that the output itself is sparse.
We show their effectiveness in decreasing the number of GMRES iterations and improving the spectral properties on synthetic data.
arXiv Detail & Related papers (2024-09-12T17:55:44Z)
- Graph Neural Preconditioners for Iterative Solutions of Sparse Linear Systems [5.083469153675402]
We propose using graph neural networks as a general-purpose preconditioner.
They show attractive performance for many problems and can be used when the mainstream preconditioners perform poorly.
arXiv Detail & Related papers (2024-06-02T17:18:41Z)
- Preconditioning for a Variational Quantum Linear Solver [0.0]
We numerically demonstrate a notable reduction in the required ansatz depth, showing that preconditioning is useful for quantum algorithms.
Our findings suggest that combining classical computing techniques, such as preconditioning, with quantum algorithms can significantly enhance the performance of NISQ algorithms.
arXiv Detail & Related papers (2023-12-25T08:50:22Z)
- The Convex Landscape of Neural Networks: Characterizing Global Optima and Stationary Points via Lasso Models [75.33431791218302]
Deep Neural Network (DNN) models are widely used in machine learning.
In this paper we examine the use of convex neural recovery models.
We show that all stationary points of the non-convex objective can be characterized as global optima of a subsampled convex program.
arXiv Detail & Related papers (2023-12-19T23:04:56Z)
- An Optimization-based Deep Equilibrium Model for Hyperspectral Image Deconvolution with Convergence Guarantees [71.57324258813675]
We propose a novel methodology for addressing the hyperspectral image deconvolution problem.
A new optimization problem is formulated, leveraging a learnable regularizer in the form of a neural network.
The derived iterative solver is then expressed as a fixed-point calculation problem within the Deep Equilibrium framework.
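As a rough illustration of that last step, the sketch below (our simplification, not the paper's model) solves a fixed-point equation z = f(z, x) by plain iteration, which is the basic computation inside a Deep Equilibrium model; the weight matrix is scaled so that f is a contraction and the iteration provably converges.

```python
# Hedged sketch of a Deep Equilibrium fixed-point solve: iterate one
# implicit layer z <- f(z, x) to convergence instead of stacking layers.
import torch

torch.manual_seed(0)
n = 8
W = torch.randn(n, n)
W = 0.4 * W / torch.linalg.matrix_norm(W, ord=2)  # spectral norm 0.4

def f(z, x):
    # tanh is 1-Lipschitz, so with ||W|| < 1 this map is a contraction.
    return torch.tanh(z @ W.T + x)

x = torch.randn(n)
z = torch.zeros(n)
for i in range(200):
    z_next = f(z, x)
    if torch.norm(z_next - z) < 1e-8:
        break
    z = z_next
print(f"fixed point reached after {i + 1} iterations")
```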
arXiv Detail & Related papers (2023-06-10T08:25:16Z)
- Neural incomplete factorization: learning preconditioners for the conjugate gradient method [2.899792823251184]
We develop a data-driven approach to accelerate the generation of effective preconditioners.
We replace the typically hand-engineered preconditioners by the output of graph neural networks.
Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF).
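A hedged sketch of that idea follows (ours, not the NeuralIF code): a trainable lower-triangular factor L is restricted to the sparsity pattern of tril(A) and fitted with a Frobenius-norm factorization loss; in NeuralIF the entries of L would instead come from a graph neural network applied to A.

```python
# Toy version of learning an incomplete factorization: fit a sparse
# lower-triangular L so that L @ L.T approximates A on A's own pattern.
import torch

torch.manual_seed(0)
n = 16
# 1-D Poisson matrix: tridiagonal and symmetric positive definite.
A = (2 * torch.eye(n)
     - torch.diag(torch.ones(n - 1), 1)
     - torch.diag(torch.ones(n - 1), -1))
mask = torch.tril((A != 0).float())   # IC(0)-style sparsity pattern

# Stand-in for the GNN output: one trainable value per allowed nonzero.
L_raw = torch.nn.Parameter(torch.eye(n))
opt = torch.optim.Adam([L_raw], lr=1e-2)

for _ in range(2000):
    L = L_raw * mask                  # hard sparsity constraint
    loss = torch.norm(L @ L.T - A)    # Frobenius factorization loss
    opt.zero_grad()
    loss.backward()
    opt.step()

print("||L L^T - A||_F after training:", float(loss))
```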
arXiv Detail & Related papers (2023-05-25T11:45:46Z)
- GradInit: Learning to Initialize Neural Networks for Stable and Efficient Training [59.160154997555956]
We present GradInit, an automated and architecture-agnostic method for initializing neural networks.
It is based on a simple heuristic: the norm of each network layer is adjusted so that a single step of SGD or Adam results in the smallest possible loss value.
It also enables training the original Post-LN Transformer for machine translation without learning rate warmup.
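The heuristic is easy to caricature in code. The sketch below (a drastic simplification of GradInit, with all names ours) searches a single global scale for the initialization and keeps the one whose first SGD step gives the lowest loss; GradInit itself learns a per-layer scale by gradient descent.

```python
# Pick the init scale whose single SGD step yields the lowest loss.
import torch
import torch.nn.functional as F

X, y = torch.randn(64, 10), torch.randn(64, 1)

def loss_after_one_sgd_step(scale, lr=0.1):
    torch.manual_seed(0)  # same initial weights for every candidate scale
    model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.Tanh(),
                                torch.nn.Linear(32, 1))
    with torch.no_grad():
        for p in model.parameters():
            p.mul_(scale)             # rescale the initialization
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss = F.mse_loss(model(X), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
    return float(F.mse_loss(model(X), y))

best = min((loss_after_one_sgd_step(s), s) for s in [0.25, 0.5, 1.0, 2.0, 4.0])
print("best post-step loss %.4f at scale %.2f" % best)
```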
arXiv Detail & Related papers (2021-02-16T11:45:35Z)
- A Meta-Learning Approach to the Optimal Power Flow Problem Under Topology Reconfigurations [69.73803123972297]
We propose a DNN-based OPF predictor that is trained using a meta-learning (MTL) approach.
The developed OPF-predictor is validated through simulations using benchmark IEEE bus systems.
arXiv Detail & Related papers (2020-12-21T17:39:51Z)
- LQF: Linear Quadratic Fine-Tuning [114.3840147070712]
We present the first method for linearizing a pre-trained model that achieves comparable performance to non-linear fine-tuning.
LQF consists of simple modifications to the architecture, loss function and optimization typically used for classification.
arXiv Detail & Related papers (2020-12-21T06:40:20Z)