Neural incomplete factorization: learning preconditioners for the
conjugate gradient method
- URL: http://arxiv.org/abs/2305.16368v2
- Date: Mon, 5 Feb 2024 16:20:08 GMT
- Title: Neural incomplete factorization: learning preconditioners for the
conjugate gradient method
- Authors: Paul Häusner, Ozan Öktem, Jens Sjölund
- Abstract summary: We develop a data-driven approach to replace the typically hand-engineered algorithms with neural networks.
Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF).
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding suitable preconditioners to accelerate iterative solution methods,
such as the conjugate gradient method, is an active area of research. In this
paper, we develop a computationally efficient data-driven approach to replace
the typically hand-engineered algorithms with neural networks. Optimizing the
condition number of the linear system directly is computationally infeasible.
Instead, our method generates an incomplete factorization of the matrix and is,
therefore, referred to as neural incomplete factorization (NeuralIF). For
efficient training, we utilize a stochastic approximation of the Frobenius loss
which only requires matrix-vector multiplications. At the core of our method is
a novel message-passing block, inspired by sparse matrix theory, that aligns
with the objective of finding a sparse factorization of the matrix. By
replacing conventional preconditioners used within the conjugate gradient
method with data-driven models based on graph neural networks, we accelerate the
iterative solving procedure. We evaluate our proposed method on both a
synthetic and a real-world problem arising from scientific computing and show
its ability to reduce the solving time while remaining computationally
efficient.
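The training objective described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration of a stochastic, Hutchinson-style estimate of the Frobenius loss ||L L^T - A||_F^2 that only requires matrix-vector products. It uses dense tensors and a random lower-triangular factor as a stand-in for the network output, so the names and setup (e.g. `stochastic_frobenius_loss`, `theta`) are illustrative assumptions rather than the authors' implementation, which operates on sparse matrices.

```python
import torch

def stochastic_frobenius_loss(L, A, num_samples=8):
    """Monte-Carlo estimate of ||L @ L.T - A||_F^2.

    Relies on E_w ||M w||^2 = ||M||_F^2 for w ~ N(0, I), so the loss
    only needs matrix-vector products and never forms L @ L.T densely.
    """
    n = A.shape[0]
    w = torch.randn(n, num_samples, dtype=A.dtype)
    residual = L @ (L.T @ w) - A @ w      # a few matvecs per probe vector
    return residual.pow(2).sum() / num_samples

# Illustrative usage on a small SPD test matrix.
n = 64
G = torch.randn(n, n)
A = G @ G.T + n * torch.eye(n)                  # symmetric positive definite

theta = torch.randn(n, n, requires_grad=True)   # raw parameters (stand-in for a GNN)
predicted_factor = torch.tril(theta)            # keep the candidate factor lower triangular

loss = stochastic_frobenius_loss(predicted_factor, A)
loss.backward()                                 # gradients flow back to theta
```

In the paper the factor comes from the message-passing network and all products involve sparse matrices, but the estimator itself is the same: averaging over a handful of Gaussian probe vectors gives an unbiased estimate of the Frobenius objective.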
Related papers
- Learning incomplete factorization preconditioners for GMRES [1.1519724914285523]
In this paper, we develop a data-driven approach to generate incomplete LU factorizations of large-scale sparse matrices.
The learned approximate factorization is utilized as a preconditioner for the corresponding linear equation system in the GMRES method.
We replace the typically hand-engineered algorithms with a graph neural network-based approach that is trained against data (a classical ILU-preconditioned GMRES baseline, which such learned factorizations replace, is sketched in the code after this list).
arXiv Detail & Related papers (2024-09-12T17:55:44Z) - Generating gradients in the energy landscape using rectified linear type
cost functions for efficiently solving 0/1 matrix factorization in Simulated
Annealing [7.339479909020814]
We propose a method to facilitate the solution process by applying a gradient to the energy landscape.
We also propose a method to quickly obtain a solution by updating the cost function's gradient during the search process.
arXiv Detail & Related papers (2023-12-27T04:19:47Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner, and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Nystrom Method for Accurate and Scalable Implicit Differentiation [25.29277451838466]
We show that the Nystrom method consistently achieves comparable or even superior performance to other approaches.
The proposed method avoids numerical instability and can be efficiently computed in matrix operations without iterations.
arXiv Detail & Related papers (2023-02-20T02:37:26Z) - Matrix Completion via Non-Convex Relaxation and Adaptive Correlation
Learning [90.8576971748142]
We develop a novel surrogate that can be optimized by closed-form solutions.
We exploit upperwise correlation for completion and thus obtain an adaptive correlation learning model.
arXiv Detail & Related papers (2022-03-04T08:50:50Z) - Memory-Efficient Backpropagation through Large Linear Layers [107.20037639738433]
In modern neural networks like Transformers, linear layers require significant memory to store activations during the backward pass.
This study proposes a memory reduction approach to perform backpropagation through linear layers.
arXiv Detail & Related papers (2022-01-31T13:02:41Z) - Converting ADMM to a Proximal Gradient for Convex Optimization Problems [4.56877715768796]
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
arXiv Detail & Related papers (2021-04-22T07:41:12Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent [34.0533596121548]
Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science.
We show that ScaledGD converges linearly at a rate independent of the condition number of the low-rank matrix.
Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
arXiv Detail & Related papers (2020-05-18T17:17:16Z)
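As a point of reference for the learned factorizations above (see the GMRES entry in the list), the snippet below shows the classical baseline they are meant to replace: an incomplete LU factorization computed by SciPy and wrapped as a preconditioner for GMRES. The tridiagonal test matrix and the iteration-counting callback are illustrative choices of ours, not taken from any of the papers.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spilu, gmres, LinearOperator

# Illustrative sparse test system (any large sparse matrix would do).
n = 500
A = sp.diags([-1.0, 2.5, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Classical incomplete LU factorization used as a preconditioner.
ilu = spilu(A, drop_tol=1e-4, fill_factor=10)
M = LinearOperator((n, n), matvec=ilu.solve)   # applies (LU)^{-1} to a vector

# Count GMRES iterations with and without the preconditioner.
iterations = {"plain": 0, "ilu": 0}
x_plain, info_plain = gmres(A, b, callback=lambda rk: iterations.update(plain=iterations["plain"] + 1))
x_ilu, info_ilu = gmres(A, b, M=M, callback=lambda rk: iterations.update(ilu=iterations["ilu"] + 1))
print(iterations)   # the preconditioned run should need far fewer iterations
```

A learned factorization such as NeuralIF slots into the same place: the network's output factors define the matvec of the preconditioner operator, and the Krylov solver itself is left unchanged.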