Neural incomplete factorization: learning preconditioners for the
conjugate gradient method
- URL: http://arxiv.org/abs/2305.16368v2
- Date: Mon, 5 Feb 2024 16:20:08 GMT
- Title: Neural incomplete factorization: learning preconditioners for the
conjugate gradient method
- Authors: Paul Häusner, Ozan Öktem, Jens Sjölund
- Abstract summary: We develop a data-driven approach to replace the typically hand-engineered algorithms with neural networks.
Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF).
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Finding suitable preconditioners to accelerate iterative solution methods,
such as the conjugate gradient method, is an active area of research. In this
paper, we develop a computationally efficient data-driven approach to replace
the typically hand-engineered algorithms with neural networks. Optimizing the
condition number of the linear system directly is computationally infeasible.
Instead, our method generates an incomplete factorization of the matrix and is,
therefore, referred to as neural incomplete factorization (NeuralIF). For
efficient training, we utilize a stochastic approximation of the Frobenius loss
which only requires matrix-vector multiplications. At the core of our method is
a novel message-passing block, inspired by sparse matrix theory, that aligns
with the objective of finding a sparse factorization of the matrix. By
replacing conventional preconditioners used within the conjugate gradient
method with data-driven models based on graph neural networks, we accelerate the
iterative solving procedure. We evaluate our proposed method on both a
synthetic and a real-world problem arising from scientific computing and show
its ability to reduce the solving time while remaining computationally
efficient.
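The training objective described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration of a stochastic, Hutchinson-style estimate of the Frobenius loss ||L L^T - A||_F^2 that only requires matrix-vector products. It uses dense tensors and a random lower-triangular factor as a stand-in for the network output, so the names and setup (e.g. `stochastic_frobenius_loss`, `theta`) are illustrative assumptions rather than the authors' implementation, which operates on sparse matrices.

```python
import torch

def stochastic_frobenius_loss(L, A, num_samples=8):
    """Monte-Carlo estimate of ||L @ L.T - A||_F^2.

    Relies on E_w ||M w||^2 = ||M||_F^2 for w ~ N(0, I), so the loss
    only needs matrix-vector products and never forms L @ L.T densely.
    """
    n = A.shape[0]
    w = torch.randn(n, num_samples, dtype=A.dtype)
    residual = L @ (L.T @ w) - A @ w      # a few matvecs per probe vector
    return residual.pow(2).sum() / num_samples

# Illustrative usage on a small SPD test matrix.
n = 64
G = torch.randn(n, n)
A = G @ G.T + n * torch.eye(n)                  # symmetric positive definite

theta = torch.randn(n, n, requires_grad=True)   # raw parameters (stand-in for a GNN)
predicted_factor = torch.tril(theta)            # keep the candidate factor lower triangular

loss = stochastic_frobenius_loss(predicted_factor, A)
loss.backward()                                 # gradients flow back to theta
```

In the paper the factor comes from the message-passing network and all products involve sparse matrices, but the estimator itself is the same: averaging over a handful of Gaussian probe vectors gives an unbiased estimate of the Frobenius objective.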
Related papers
- Learning incomplete factorization preconditioners for GMRES [1.1519724914285523]
In this paper, we develop a data-driven approach to generate incomplete LU factorizations of large-scale sparse matrices.
The learned approximate factorization is utilized as a preconditioner for the corresponding linear equation system in the GMRES method.
We replace the typically hand-engineered algorithms with a graph neural network-based approach that is trained against data (a classical ILU-preconditioned GMRES baseline, which such learned factorizations replace, is sketched in the code after this list).
arXiv Detail & Related papers (2024-09-12T17:55:44Z) - Generating gradients in the energy landscape using rectified linear type
cost functions for efficiently solving 0/1 matrix factorization in Simulated
Annealing [7.339479909020814]
We propose a method to facilitate the solution process by applying a gradient to the energy landscape.
We also propose a method to quickly obtain a solution by updating the cost function's gradient during the search process.
arXiv Detail & Related papers (2023-12-27T04:19:47Z) - Stochastic Gradient Descent for Gaussian Processes Done Right [86.83678041846971]
We show that when done right -- by which we mean using specific insights from the optimisation and kernel communities -- gradient descent is highly effective.
We introduce a stochastic dual descent algorithm, explain its design in an intuitive manner, and illustrate the design choices.
Our method places Gaussian process regression on par with state-of-the-art graph neural networks for molecular binding affinity prediction.
arXiv Detail & Related papers (2023-10-31T16:15:13Z) - Stochastic Optimization for Non-convex Problem with Inexact Hessian
Matrix, Gradient, and Function [99.31457740916815]
Trust-region (TR) and adaptive regularization using cubics (ARC) have proven to have some very appealing theoretical properties.
We show that TR and ARC methods can simultaneously provide inexact computations of the Hessian, gradient, and function values.
arXiv Detail & Related papers (2023-10-18T10:29:58Z) - Nystrom Method for Accurate and Scalable Implicit Differentiation [25.29277451838466]
We show that the Nystrom method consistently achieves comparable or even superior performance to other approaches.
The proposed method avoids numerical instability and can be efficiently computed in matrix operations without iterations.
arXiv Detail & Related papers (2023-02-20T02:37:26Z) - Matrix Completion via Non-Convex Relaxation and Adaptive Correlation
Learning [90.8576971748142]
We develop a novel surrogate that can be optimized by closed-form solutions.
We exploit upperwise correlation for completion and thus obtain an adaptive correlation learning model.
arXiv Detail & Related papers (2022-03-04T08:50:50Z) - Memory-Efficient Backpropagation through Large Linear Layers [107.20037639738433]
In modern neural networks like Transformers, linear layers require significant memory to store activations during the backward pass.
This study proposes a memory reduction approach to perform backpropagation through linear layers.
arXiv Detail & Related papers (2022-01-31T13:02:41Z) - Converting ADMM to a Proximal Gradient for Convex Optimization Problems [4.56877715768796]
In sparse estimation, such as fused lasso and convex clustering, we apply either the proximal gradient method or the alternating direction method of multipliers (ADMM) to solve the problem.
This paper proposes a general method for converting the ADMM solution to the proximal gradient method, assuming that the constraints and objectives are strongly convex.
We show by numerical experiments that we can obtain a significant improvement in terms of efficiency.
arXiv Detail & Related papers (2021-04-22T07:41:12Z) - Solving weakly supervised regression problem using low-rank manifold
regularization [77.34726150561087]
We solve a weakly supervised regression problem.
Under "weakly" we understand that for some training points the labels are known, for some unknown, and for others uncertain due to the presence of random noise or other reasons such as lack of resources.
In the numerical section, we applied the suggested method to artificial and real datasets using Monte-Carlo modeling.
arXiv Detail & Related papers (2021-04-13T23:21:01Z) - Understanding Implicit Regularization in Over-Parameterized Single Index
Model [55.41685740015095]
We design regularization-free algorithms for the high-dimensional single index model.
We provide theoretical guarantees for the induced implicit regularization phenomenon.
arXiv Detail & Related papers (2020-07-16T13:27:47Z) - Accelerating Ill-Conditioned Low-Rank Matrix Estimation via Scaled
Gradient Descent [34.0533596121548]
Low-rank matrix estimation is a canonical problem that finds numerous applications in signal processing, machine learning and imaging science.
We show that ScaledGD converges linearly at a rate independent of the condition number of the low-rank matrix.
Our analysis is also applicable to general loss functions that are restricted strongly convex and smooth over low-rank matrices.
arXiv Detail & Related papers (2020-05-18T17:17:16Z)
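As a point of reference for the learned factorizations above (see the GMRES entry in the list), the snippet below shows the classical baseline they are meant to replace: an incomplete LU factorization computed by SciPy and wrapped as a preconditioner for GMRES. The tridiagonal test matrix and the iteration-counting callback are illustrative choices of ours, not taken from any of the papers.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import spilu, gmres, LinearOperator

# Illustrative sparse test system (any large sparse matrix would do).
n = 500
A = sp.diags([-1.0, 2.5, -1.0], offsets=[-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

# Classical incomplete LU factorization used as a preconditioner.
ilu = spilu(A, drop_tol=1e-4, fill_factor=10)
M = LinearOperator((n, n), matvec=ilu.solve)   # applies (LU)^{-1} to a vector

# Count GMRES iterations with and without the preconditioner.
iterations = {"plain": 0, "ilu": 0}
x_plain, info_plain = gmres(A, b, callback=lambda rk: iterations.update(plain=iterations["plain"] + 1))
x_ilu, info_ilu = gmres(A, b, M=M, callback=lambda rk: iterations.update(ilu=iterations["ilu"] + 1))
print(iterations)   # the preconditioned run should need far fewer iterations
```

A learned factorization such as NeuralIF slots into the same place: the network's output factors define the matvec of the preconditioner operator, and the Krylov solver itself is left unchanged.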