Related papers: Algebraformer: A Neural Approach to Linear Systems

Algebraformer: A Neural Approach to Linear Systems

URL: http://arxiv.org/abs/2511.14263v1
Date: Tue, 18 Nov 2025 08:53:22 GMT
Title: Algebraformer: A Neural Approach to Linear Systems
Authors: Pietro Sittoni, Francesco Tudisco,
Abstract summary: We investigate the fundamental task of solving linear systems, particularly those that are ill-conditioned.<n>Existing numerical methods for ill-conditioned systems often require careful parameter tuning, preconditioning, or domain-specific expertise to ensure accuracy and stability.<n>We propose Algebraformer, a Transformer-based architecture that learns to solve linear systems end-to-end, even in the presence of severe ill-conditioning.
Score: 10.284321066857151
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Recent work in deep learning has opened new possibilities for solving classical algorithmic tasks using end-to-end learned models. In this work, we investigate the fundamental task of solving linear systems, particularly those that are ill-conditioned. Existing numerical methods for ill-conditioned systems often require careful parameter tuning, preconditioning, or domain-specific expertise to ensure accuracy and stability. In this work, we propose Algebraformer, a Transformer-based architecture that learns to solve linear systems end-to-end, even in the presence of severe ill-conditioning. Our model leverages a novel encoding scheme that enables efficient representation of matrix and vector inputs, with a memory complexity of $O(n^2)$, supporting scalable inference. We demonstrate its effectiveness on application-driven linear problems, including interpolation tasks from spectral methods for boundary value problems and acceleration of the Newton method. Algebraformer achieves competitive accuracy with significantly lower computational overhead at test time, demonstrating that general-purpose neural architectures can effectively reduce complexity in traditional scientific computing pipelines.

Related papers

Revisiting Orbital Minimization Method for Neural Operator Decomposition [19.86950069790711]
We revisit a classical optimization framework known as the emphorbital method (OMM) originally proposed in the 1990s for solving eigenvalue problems in computational chemistry.<n>We adapt this framework to train neural networks to decompose positive semidefinite operators, and demonstrate its practical advantages across a range of benchmark tasks.
arXiv Detail & Related papers (2025-10-24T18:26:18Z)
Learning from Linear Algebra: A Graph Neural Network Approach to Preconditioner Design for Conjugate Gradient Solvers [40.6591136324878]
We train GNNs to obtain preconditioners that reduce the condition number of the system more significantly than classical preconditioners.<n>Our approach outperforms both classical and neural network-based methods for an important class of parametric partial differential equations.
arXiv Detail & Related papers (2024-05-24T13:44:30Z)
A Neural Rewriting System to Solve Algorithmic Problems [47.129504708849446]
We propose a modular architecture designed to learn a general procedure for solving nested mathematical formulas. Inspired by rewriting systems, a classic framework in symbolic artificial intelligence, we include in the architecture three specialized and interacting modules. We benchmark our system against the Neural Data Router, a recent model specialized for systematic generalization, and a state-of-the-art large language model (GPT-4) probed with advanced prompting strategies.
arXiv Detail & Related papers (2024-02-27T10:57:07Z)
Mechanistic Neural Networks for Scientific Machine Learning [58.99592521721158]
We present Mechanistic Neural Networks, a neural network design for machine learning applications in the sciences. It incorporates a new Mechanistic Block in standard architectures to explicitly learn governing differential equations as representations. Central to our approach is a novel Relaxed Linear Programming solver (NeuRLP) inspired by a technique that reduces solving linear ODEs to solving linear programs.
arXiv Detail & Related papers (2024-02-20T15:23:24Z)
Differentiable Visual Computing for Inverse Problems and Machine Learning [27.45555082573493]
Visual computing methods are used to analyze geometry, physically simulate solids, fluids, and other media, and render the world via optical techniques. Deep learning (DL) allows for the construction of general algorithmic models, side stepping the need for a purely first principles-based approach to problem solving. DL is powered by highly parameterized neural network architectures -- universal function approximators -- and gradient-based search algorithms.
arXiv Detail & Related papers (2023-11-21T23:02:58Z)
Neural incomplete factorization: learning preconditioners for the conjugate gradient method [2.899792823251184]
We develop a data-driven approach to accelerate the generation of effective preconditioners. We replace the typically hand-engineered preconditioners by the output of graph neural networks. Our method generates an incomplete factorization of the matrix and is, therefore, referred to as neural incomplete factorization (NeuralIF)
arXiv Detail & Related papers (2023-05-25T11:45:46Z)
Mixed formulation of physics-informed neural networks for thermo-mechanically coupled systems and heterogeneous domains [0.0]
Physics-informed neural networks (PINNs) are a new tool for solving boundary value problems. Recent investigations have shown that when designing loss functions for many engineering problems, using first-order derivatives and combining equations from both strong and weak forms can lead to much better accuracy. In this work, we propose applying the mixed formulation to solve multi-physical problems, specifically a stationary thermo-mechanically coupled system of equations.
arXiv Detail & Related papers (2023-02-09T21:56:59Z)
AttNS: Attention-Inspired Numerical Solving For Limited Data Scenarios [51.94807626839365]
We propose the attention-inspired numerical solver (AttNS) to solve differential equations due to limited data.<n>AttNS is inspired by the effectiveness of attention modules in Residual Neural Networks (ResNet) in enhancing model generalization and robustness.
arXiv Detail & Related papers (2023-02-05T01:39:21Z)
On Fast Simulation of Dynamical System with Neural Vector Enhanced Numerical Solver [59.13397937903832]
We introduce a deep learning-based corrector called Neural Vector (NeurVec) NeurVec can compensate for integration errors and enable larger time step sizes in simulations. Our experiments on a variety of complex dynamical system benchmarks demonstrate that NeurVec exhibits remarkable generalization capability.
arXiv Detail & Related papers (2022-08-07T09:02:18Z)
Stabilizing Q-learning with Linear Architectures for Provably Efficient Learning [53.17258888552998]
This work proposes an exploration variant of the basic $Q$-learning protocol with linear function approximation. We show that the performance of the algorithm degrades very gracefully under a novel and more permissive notion of approximation error.
arXiv Detail & Related papers (2022-06-01T23:26:51Z)
A Deep Gradient Correction Method for Iteratively Solving Linear Systems [5.744903762364991]
We present a novel approach to approximate the solution of large, sparse, symmetric, positive-definite linear systems of equations. Our algorithm is capable of reducing the linear system residual to a given tolerance in a small number of iterations.
arXiv Detail & Related papers (2022-05-22T06:40:38Z)
Message Passing Neural PDE Solvers [60.77761603258397]
We build a neural message passing solver, replacing allally designed components in the graph with backprop-optimized neural function approximators. We show that neural message passing solvers representationally contain some classical methods, such as finite differences, finite volumes, and WENO schemes. We validate our method on various fluid-like flow problems, demonstrating fast, stable, and accurate performance across different domain topologies, equation parameters, discretizations, etc., in 1D and 2D.
arXiv Detail & Related papers (2022-02-07T17:47:46Z)
Supervised DKRC with Images for Offline System Identification [77.34726150561087]
Modern dynamical systems are becoming increasingly non-linear and complex. There is a need for a framework to model these systems in a compact and comprehensive representation for prediction and control. Our approach learns these basis functions using a supervised learning approach.
arXiv Detail & Related papers (2021-09-06T04:39:06Z)
Model-Based Deep Learning [155.063817656602]
Signal processing, communications, and control have traditionally relied on classical statistical modeling techniques. Deep neural networks (DNNs) use generic architectures which learn to operate from data, and demonstrate excellent performance. We are interested in hybrid techniques that combine principled mathematical models with data-driven systems to benefit from the advantages of both approaches.
arXiv Detail & Related papers (2020-12-15T16:29:49Z)
Quantum-Inspired Algorithms from Randomized Numerical Linear Algebra [53.46106569419296]
We create classical (non-quantum) dynamic data structures supporting queries for recommender systems and least-squares regression. We argue that the previous quantum-inspired algorithms for these problems are doing leverage or ridge-leverage score sampling in disguise.
arXiv Detail & Related papers (2020-11-09T01:13:07Z)

This list is automatically generated from the titles and abstracts of the papers in this site.