Approximating Matrix Functions with Deep Neural Networks and Transformers
- URL: http://arxiv.org/abs/2602.07800v1
- Date: Sun, 08 Feb 2026 03:45:25 GMT
- Title: Approximating Matrix Functions with Deep Neural Networks and Transformers
- Authors: Rahul Padmanabhan, Simone Brugiapaglia
- Abstract summary: We study the approximation of matrix functions, which map scalar functions to matrices, using neural networks including transformers. We show experimentally that a transformer encoder-decoder with suitable numerical encodings can approximate certain matrix functions at a relative error of 5% with high probability.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Transformers have revolutionized natural language processing, but their use for numerical computation has received less attention. We study the approximation of matrix functions, which map scalar functions to matrices, using neural networks including transformers. We focus on functions mapping square matrices to square matrices of the same dimension. These types of matrix functions appear throughout scientific computing, e.g., the matrix exponential in continuous-time Markov chains and the matrix sign function in stability analysis of dynamical systems. In this paper, we make two contributions. First, we prove bounds on the width and depth of ReLU networks needed to approximate the matrix exponential to an arbitrary precision. Second, we show experimentally that a transformer encoder-decoder with suitable numerical encodings can approximate certain matrix functions at a relative error of 5% with high probability. Our study reveals that the encoding scheme strongly affects performance, with different schemes working better for different functions.
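As a concrete illustration of the two matrix functions named in the abstract, the sketch below (not from the paper; it evaluates the functions with SciPy's `expm` and `signm` rather than a neural network) computes the matrix exponential and the matrix sign function, and checks a truncated Taylor approximation against the 5% relative-error criterion used in the paper's experiments.

```python
import numpy as np
from scipy.linalg import expm, signm

# Both functions map a square matrix to a square matrix of the same dimension.
A = np.array([[0.0, 1.0],
              [-1.0, 0.0]])  # generator of a rotation

# Matrix exponential: here expm(A) is the rotation by 1 radian.
E = expm(A)

# Matrix sign function: eigenvalues are mapped to +/-1 by the sign
# of their real part, keeping the eigenvectors.
S = signm(np.diag([2.0, -3.0]))  # -> diag(1, -1)

# The paper's 5% relative-error criterion, applied to a hypothetical
# approximation of expm(A) (a degree-3 truncated Taylor series):
E_hat = np.eye(2) + A + A @ A / 2 + A @ A @ A / 6
rel_err = np.linalg.norm(E_hat - E) / np.linalg.norm(E)
print(rel_err < 0.05)  # True
```

For this particular matrix the truncated series already meets the 5% threshold; the paper's point is that a trained transformer can do so for whole distributions of input matrices.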
Related papers
- NeuMatC: A General Neural Framework for Fast Parametric Matrix Operation [75.91285900600549]
We propose the Neural Matrix Computation Framework (NeuMatC), which tackles general parametric matrix operation tasks. NeuMatC learns, without supervision, a low-rank and continuous mapping from parameters to their corresponding matrix operation results. Experimental results on both synthetic and real-world datasets demonstrate the promising performance of NeuMatC.
arXiv Detail & Related papers (2025-11-28T07:21:17Z) - The Ubiquitous Sparse Matrix-Matrix Products [0.0]
The multiplication of a sparse matrix with another (dense or sparse) matrix is a fundamental operation that captures the computational patterns of many data science applications. We provide a unifying treatment of the sparse matrix-matrix operation and its rich application space, including machine learning, computational biology and chemistry, graph algorithms, and scientific computing.
arXiv Detail & Related papers (2025-08-06T04:26:52Z) - Quantum Eigensolver for Non-Normal Matrices via Ground State Energy Estimation [0.4511923587827302]
Large-scale eigenvalue problems pose a significant challenge to classical computers. We propose a quantum algorithm that outputs an estimate of an eigenvalue to within additive error $\epsilon$ with probability at least $1-p_{\rm fail}$. Our algorithm is the first general eigenvalue algorithm that achieves this scaling.
arXiv Detail & Related papers (2025-02-25T11:43:47Z) - Bauer's Spectral Factorization Method for Low Order Multiwavelet Filter Design [0.6138671548064355]
We introduce a fast method for matrix spectral factorization based on Bauer's method.
We convert Bauer's method into a nonlinear matrix equation (NME).
The NME is solved by two different numerical algorithms.
arXiv Detail & Related papers (2023-12-09T00:26:52Z) - Quantum algorithms for matrix operations and linear systems of equations [65.62256987706128]
We propose quantum algorithms for matrix operations using the "Sender-Receiver" model.
These quantum protocols can be used as subroutines in other quantum schemes.
arXiv Detail & Related papers (2022-02-10T08:12:20Z) - Fast Differentiable Matrix Square Root and Inverse Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root and the inverse square root.
For the forward propagation, one method is to use the Matrix Taylor Polynomial (MTP), and the other is to use Matrix Padé Approximants (MPA).
A series of numerical tests show that both methods yield considerable speed-up compared with the SVD or the NS iteration.
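For context, the NS (Newton-Schulz) iteration used as a baseline above can be sketched as follows. This is a generic textbook version of the coupled iteration, assuming a symmetric positive definite input; it is not the authors' implementation.

```python
import numpy as np

def newton_schulz_sqrt(A, iters=20):
    """Coupled Newton-Schulz iteration for the matrix square root.

    Assumes A is symmetric positive definite. After scaling A by its
    Frobenius norm, Y converges to sqrt(A/||A||) and Z to its inverse;
    the scaling is undone on return. Only matrix products are used,
    which is why the iteration is attractive as an SVD-free baseline.
    """
    norm = np.linalg.norm(A)           # Frobenius norm for scaling
    Y = A / norm
    Z = np.eye(A.shape[0])
    I3 = 3.0 * np.eye(A.shape[0])
    for _ in range(iters):
        T = 0.5 * (I3 - Z @ Y)
        Y, Z = Y @ T, T @ Z
    return np.sqrt(norm) * Y           # undo the scaling

# Usage: squaring the result should recover A.
A = np.array([[4.0, 1.0],
              [1.0, 3.0]])
S = newton_schulz_sqrt(A)
print(np.allclose(S @ S, A))  # True
```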
arXiv Detail & Related papers (2022-01-29T10:00:35Z) - Fast Differentiable Matrix Square Root [65.67315418971688]
We propose two more efficient variants to compute the differentiable matrix square root.
For the forward propagation, one method is to use the Matrix Taylor Polynomial (MTP).
The other method is to use Matrix Padé Approximants (MPA).
arXiv Detail & Related papers (2022-01-21T12:18:06Z) - Sparse Factorization of Large Square Matrices [10.94053598642913]
In this paper, we propose to approximate a large square matrix with a product of sparse full-rank matrices.
In the approximation, our method needs only $N(\log N)^2$ non-zero numbers for an $N\times N$ full matrix.
We show that our method gives a better approximation when the approximated matrix is sparse and high-rank.
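To see what a $N(\log N)^2$ nonzero count buys over the $N^2$ entries of a dense matrix, a back-of-the-envelope comparison (illustrative arithmetic only; the base-2 logarithm is assumed here, and the constants hidden in the O-notation are omitted):

```python
import math

# Storage implied by the sparse factorization bound vs. a dense matrix.
for N in (256, 4096, 65536):
    sparse = N * math.log2(N) ** 2   # ~ nonzeros across the sparse factors
    dense = N ** 2                   # nonzeros of the full matrix
    print(f"N={N:6d}  N(log N)^2={sparse:>14,.0f}  "
          f"N^2={dense:>14,}  ratio={dense / sparse:8.1f}")
```

The savings ratio itself grows like $N/(\log N)^2$, so the factorized form pays off increasingly at larger sizes.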
arXiv Detail & Related papers (2021-09-16T18:42:21Z) - Robust 1-bit Compressive Sensing with Partial Gaussian Circulant Matrices and Generative Priors [54.936314353063494]
We provide recovery guarantees for a correlation-based optimization algorithm for robust 1-bit compressive sensing.
We make use of a practical iterative algorithm, and perform numerical experiments on image datasets to corroborate our results.
arXiv Detail & Related papers (2021-08-08T05:28:06Z) - Non-PSD Matrix Sketching with Applications to Regression and Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD and "square-roots" matrices.
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences of its use.