Learning to Discover Iterative Spectral Algorithms
- URL: http://arxiv.org/abs/2602.09530v1
- Date: Tue, 10 Feb 2026 08:39:59 GMT
- Title: Learning to Discover Iterative Spectral Algorithms
- Authors: Zihang Liu, Oleg Balabanov, Yaoqing Yang, Michael W. Mahoney
- Abstract summary: AutoSpec is a neural network framework for discovering iterative spectral algorithms for large-scale numerical linear algebra. We apply AutoSpec to discovering algorithms for representative numerical linear algebra tasks. On real-world matrices, the learned procedures deliver orders-of-magnitude improvements in accuracy and/or reductions in iteration count.
- Score: 46.39331984989963
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We introduce AutoSpec, a neural network framework for discovering iterative spectral algorithms for large-scale numerical linear algebra and numerical optimization. Our self-supervised models adapt to input operators using coarse spectral information (e.g., eigenvalue estimates and residual norms), and they predict recurrence coefficients for computing or applying a matrix polynomial tailored to a downstream task. The effectiveness of AutoSpec relies on three ingredients: an architecture whose inference pass implements short, executable numerical linear algebra recurrences; efficient training on small synthetic problems with transfer to large-scale real-world operators; and task-defined objectives that enforce the desired approximation or preconditioning behavior across the range of spectral profiles represented in the training set. We apply AutoSpec to discovering algorithms for representative numerical linear algebra tasks: accelerating matrix-function approximation; accelerating sparse linear solvers; and spectral filtering/preconditioning for eigenvalue computations. On real-world matrices, the learned procedures deliver orders-of-magnitude improvements in accuracy and/or reductions in iteration count, relative to basic baselines. We also find clear connections to classical theory: the induced polynomials often exhibit near-equiripple, near-minimax behavior characteristic of Chebyshev polynomials.
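To make the idea concrete, here is a minimal sketch (not the paper's actual architecture or parameterization) of the kind of executable recurrence the abstract describes: a short three-term polynomial recurrence applied to a vector, where the step coefficients and combination weights stand in for quantities a trained model would predict from coarse spectral features.

```python
import numpy as np

def apply_learned_polynomial(A, v, recurrence, weights):
    """Apply p(A) v through a short three-term recurrence.

    `recurrence` is a list of (alpha_k, beta_k, gamma_k) triples and `weights`
    a list of len(recurrence) + 1 combination coefficients c_k. In an
    AutoSpec-style workflow both would come from a trained model that reads
    coarse spectral features; this exact parameterization is an assumption.
    """
    t_prev = np.zeros_like(v)
    t_curr = v.copy()
    result = weights[0] * t_curr
    for (alpha, beta, gamma), c in zip(recurrence, weights[1:]):
        # One matrix-vector product per step, as in Chebyshev-type iterations.
        t_next = alpha * (A @ t_curr) + beta * t_curr + gamma * t_prev
        t_prev, t_curr = t_curr, t_next
        result = result + c * t_curr
    return result

# Hypothetical usage: the coefficient lists would be predicted by the model.
# recurrence = model_recurrence(spectral_features)   # [(alpha, beta, gamma), ...]
# weights    = model_weights(spectral_features)      # [c_0, c_1, ...]
# y = apply_learned_polynomial(A, v, recurrence, weights)
```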
Related papers
- PRISM: Distribution-free Adaptive Computation of Matrix Functions for Accelerating Neural Network Training [47.80717552769429]
We present PRISM (Polynomial-fitting and Randomized Iterative Sketching for Matrix functions), a framework for accelerating algorithms that compute matrix functions. PRISM combines adaptive approximation with randomized sketching: at each iteration, it fits a surrogate to the current spectrum via a sketched least-squares problem. Unlike prior methods, PRISM requires no explicit spectral bounds or singular value estimates, and it adapts automatically to the evolving spectrum.
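As a rough illustration of the sketched least-squares step mentioned above (the function name, monomial basis, and Gaussian sketch are assumptions, not PRISM's actual algorithm), one can fit a surrogate polynomial to crude spectral estimates through a randomly sketched Vandermonde system:

```python
import numpy as np

def sketched_polynomial_fit(nodes, values, degree, sketch_size, rng=None):
    """Sketch of a sketched least-squares polynomial fit.

    `nodes` are rough spectral estimates (e.g., Ritz values) and `values` the
    target function evaluated there; a Gaussian sketch compresses the
    Vandermonde system before solving. Illustrative only.
    """
    rng = np.random.default_rng() if rng is None else rng
    V = np.vander(nodes, degree + 1, increasing=True)            # LS basis
    S = rng.standard_normal((sketch_size, len(nodes))) / np.sqrt(sketch_size)
    coeffs, *_ = np.linalg.lstsq(S @ V, S @ values, rcond=None)  # sketched solve
    return coeffs  # surrogate polynomial coefficients in the monomial basis

# Example: approximate exp on crude eigenvalue estimates.
# nodes = np.linspace(0.1, 5.0, 200)
# coeffs = sketched_polynomial_fit(nodes, np.exp(nodes), degree=8, sketch_size=40)
```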
arXiv Detail & Related papers (2026-01-29T18:55:46Z) - Gradient Descent as a Perceptron Algorithm: Understanding Dynamics and Implicit Acceleration [67.12978375116599]
We show that the steps of gradient descent (GD) reduce to those of generalized perceptron algorithms. This helps explain the optimization dynamics and the implicit acceleration phenomenon observed in neural networks.
arXiv Detail & Related papers (2025-12-12T14:16:35Z) - Spectral Estimators for Multi-Index Models: Precise Asymptotics and Optimal Weak Recovery [21.414505380263016]
We focus on recovering the subspace spanned by the signals via spectral estimators. Our main technical contribution is a precise characterization of the performance of spectral methods. Our analysis unveils a phase transition phenomenon in which, as the sample complexity grows, eigenvalues escape from the bulk of the spectrum.
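For context, a generic spectral estimator of this kind builds a data-dependent matrix and reads off its leading eigenvectors; the sketch below uses a simple weighted second-moment construction as an illustrative stand-in, not the paper's optimal preprocessing.

```python
import numpy as np

def spectral_subspace_estimate(X, y, T, k):
    """Generic spectral estimator sketch for (multi-)index models.

    Builds D = (1/n) * sum_i T(y_i) x_i x_i^T and returns its top-k
    eigenvectors as the subspace estimate. The preprocessing T and this
    exact construction are assumptions for illustration.
    """
    n, _ = X.shape
    weights = T(y)                                   # scalar weight per sample
    D = (X.T * weights) @ X / n                      # d x d weighted second moment
    eigvals, eigvecs = np.linalg.eigh(D)
    return eigvecs[:, np.argsort(eigvals)[-k:]]      # top-k eigenvectors
```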
arXiv Detail & Related papers (2025-02-03T18:08:30Z) - Learning Linear Attention in Polynomial Time [127.14106124669645]
We provide the first results on learnability of single-layer Transformers with linear attention. We show that linear attention may be viewed as a linear predictor in a suitably defined RKHS. We show how to efficiently identify training datasets for which every empirical risk minimizer is equivalent to the linear Transformer.
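As a reference point for the model class being analyzed, a minimal single-layer linear-attention forward pass (softmax removed) looks as follows; the fact that each output entry is a fixed polynomial in the input is what makes the linear-predictor/RKHS view possible, though the paper's precise construction is not reproduced here.

```python
import numpy as np

def linear_attention(X, W_q, W_k, W_v):
    """Minimal single-layer linear attention (no softmax), as a sketch.

    The output is (X W_q)(X W_k)^T (X W_v); every entry is a fixed
    degree-3 polynomial in the entries of X, i.e., a linear function of a
    suitable feature map of the input sequence.
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    return (Q @ K.T) @ V
```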
arXiv Detail & Related papers (2024-10-14T02:41:01Z) - Learning incomplete factorization preconditioners for GMRES [1.1519724914285523]
We train a graph neural network to approximate the matrix factorization directly. Applying a graph neural network architecture allows us to ensure that the output itself is sparse. We show the effectiveness of the learned preconditioners in decreasing the number of GMRES iterations and improving the spectral properties on synthetic data.
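For readers unfamiliar with the setting, the classical pipeline the paper builds on is an incomplete factorization used as a GMRES preconditioner; the SciPy sketch below shows that baseline (the learned GNN factorization itself is not reproduced here).

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

# Classical baseline: incomplete LU preconditioning for GMRES on a sparse system.
n = 200
A = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(n, n), format="csc")
b = np.ones(n)

ilu = spla.spilu(A, drop_tol=1e-4)             # incomplete factorization
M = spla.LinearOperator(A.shape, ilu.solve)    # preconditioner as an operator
x, info = spla.gmres(A, b, M=M)
print("converged" if info == 0 else f"gmres info={info}")
```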
arXiv Detail & Related papers (2024-09-12T17:55:44Z) - Operator SVD with Neural Networks via Nested Low-Rank Approximation [19.562492156734653]
This paper proposes a new optimization framework based on the low-rank approximation characterization of a truncated singular value decomposition.
We introduce new techniques, called nesting, for learning the top-$L$ singular values and singular functions in the correct order.
We demonstrate the effectiveness of the proposed framework for use cases in computational physics and machine learning.
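The low-rank characterization referred to above can be illustrated on a small matrix: minimizing the Frobenius-norm low-rank objective by gradient descent recovers the top-$L$ singular subspaces. The PyTorch sketch below is a simplification; the paper works with operators parameterized by neural networks and adds the nesting mechanism to order the components.

```python
import torch

def lowrank_svd(A, L, steps=2000, lr=1e-2):
    """Recover the top-L singular subspaces of a torch.Tensor A by
    minimizing ||A - U V^T||_F^2 with gradient descent (illustrative only)."""
    m, n = A.shape
    U = torch.randn(m, L, requires_grad=True)
    V = torch.randn(n, L, requires_grad=True)
    opt = torch.optim.Adam([U, V], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        loss = ((A - U @ V.T) ** 2).sum()   # low-rank approximation objective
        loss.backward()
        opt.step()
    return U.detach(), V.detach()           # span the top-L singular subspaces
```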
arXiv Detail & Related papers (2024-02-06T03:06:06Z) - CoLA: Exploiting Compositional Structure for Automatic and Efficient Numerical Linear Algebra [62.37017125812101]
We propose a simple but general framework for large-scale linear algebra problems in machine learning, named CoLA.
By combining a linear operator abstraction with compositional dispatch rules, CoLA automatically constructs memory and runtime efficient numerical algorithms.
We showcase its efficacy across a broad range of applications, including partial differential equations, Gaussian processes, equivariant model construction, and unsupervised learning.
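To illustrate the compositional-dispatch idea in plain NumPy/SciPy (this is not the CoLA API), a Kronecker-structured operator can be applied and solved through its small factors without ever forming the large matrix:

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator

def kron_matvec(A, B, x):
    # (A (x) B) vec(X) computed via the small factors only.
    X = x.reshape(A.shape[0], B.shape[0])
    return (A @ X @ B.T).ravel()

def kron_solve(A, B, b):
    # Structure-aware rule: (A (x) B)^{-1} vec(Y) = vec(A^{-1} Y B^{-T}),
    # so one large solve dispatches to two small ones.
    Y = b.reshape(A.shape[0], B.shape[0])
    Z = np.linalg.solve(A, Y)
    return np.linalg.solve(B, Z.T).T.ravel()

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 4)) + 4 * np.eye(4)
B = rng.standard_normal((3, 3)) + 3 * np.eye(3)
K = LinearOperator((12, 12), matvec=lambda x: kron_matvec(A, B, x))
b = rng.standard_normal(12)
x = kron_solve(A, B, b)
print(np.allclose(K.matvec(x), b))   # True: structured solve matches the operator
```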
arXiv Detail & Related papers (2023-09-06T14:59:38Z) - Regularization, early-stopping and dreaming: a Hopfield-like setup to address generalization and overfitting [0.0]
We look for optimal network parameters by applying gradient descent over a regularized loss function.
Within this framework, the optimal neuron-interaction matrices correspond to Hebbian kernels revised by a reiterated unlearning protocol.
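A highly simplified sketch of this kind of setup, under assumed choices (quadratic reconstruction loss, L2 penalty, Hebbian initialization) that are illustrative rather than the paper's exact formulation:

```python
import numpy as np

def train_couplings(patterns, lam=0.05, lr=0.01, steps=500):
    """Gradient descent on a regularized loss for a Hopfield-like coupling
    matrix J, starting from the Hebbian kernel (illustrative assumptions)."""
    P, N = patterns.shape
    J = patterns.T @ patterns / N            # Hebbian initialization
    np.fill_diagonal(J, 0.0)
    for _ in range(steps):
        recon = patterns @ J                 # local fields for each pattern
        grad = -patterns.T @ (patterns - recon) / P + lam * J
        J -= lr * grad                       # descend the regularized loss
        np.fill_diagonal(J, 0.0)
    return J
```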
arXiv Detail & Related papers (2023-08-01T15:04:30Z) - Performance Embeddings: A Similarity-based Approach to Automatic
Performance Optimization [71.69092462147292]
Performance embeddings enable knowledge transfer of performance tuning between applications.
We demonstrate this transfer tuning approach on case studies in deep neural networks, dense and sparse linear algebra compositions, and numerical weather prediction stencils.
arXiv Detail & Related papers (2023-03-14T15:51:35Z)