Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems
- URL: http://arxiv.org/abs/2405.05818v1
- Date: Thu, 9 May 2024 14:56:49 GMT
- Title: Fine-grained Analysis and Faster Algorithms for Iteratively Solving Linear Systems
- Authors: Michał Dereziński, Daniel LeJeune, Deanna Needell, Elizaveta Rebrova
- Abstract summary: We consider the spectral tail condition number, $\kappa_\ell$, defined as the ratio between the $\ell$th largest and the smallest singular value of the matrix representing the system.
Some of the implications of our result, and of the use of $\kappa_\ell$, include a direct improvement over a fine-grained analysis of the Conjugate Gradient method.
- Score: 9.30306458153248
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: While effective in practice, iterative methods for solving large systems of linear equations can be significantly affected by problem-dependent condition number quantities. This makes characterizing their time complexity challenging, particularly when we wish to make comparisons between deterministic and stochastic methods, that may or may not rely on preconditioning and/or fast matrix multiplication. In this work, we consider a fine-grained notion of complexity for iterative linear solvers which we call the spectral tail condition number, $\kappa_\ell$, defined as the ratio between the $\ell$th largest and the smallest singular value of the matrix representing the system. Concretely, we prove the following main algorithmic result: Given an $n\times n$ matrix $A$ and a vector $b$, we can find $\tilde{x}$ such that $\|A\tilde{x}-b\|\leq\epsilon\|b\|$ in time $\tilde{O}(\kappa_\ell\cdot n^2\log 1/\epsilon)$ for any $\ell = O(n^{\frac1{\omega-1}})=O(n^{0.729})$, where $\omega \approx 2.372$ is the current fast matrix multiplication exponent. This guarantee is achieved by Sketch-and-Project with Nesterov's acceleration. Some of the implications of our result, and of the use of $\kappa_\ell$, include direct improvement over a fine-grained analysis of the Conjugate Gradient method, suggesting a stronger separation between deterministic and stochastic iterative solvers; and relating the complexity of iterative solvers to the ongoing algorithmic advances in fast matrix multiplication, since the bound on $\ell$ improves with $\omega$. Our main technical contributions are new sharp characterizations for the first and second moments of the random projection matrix that commonly arises in sketching algorithms, building on a combination of techniques from combinatorial sampling via determinantal point processes and Gaussian universality results from random matrix theory.
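For intuition, here is a minimal NumPy sketch of the basic (unaccelerated) Sketch-and-Project iteration for $Ax=b$. The Gaussian sketching matrix, the sketch size `ell`, and the pseudoinverse-based projection are illustrative assumptions rather than the paper's exact implementation, and Nesterov's acceleration is omitted.

```python
import numpy as np

def sketch_and_project(A, b, ell=50, num_iters=1000, seed=0):
    """Basic (unaccelerated) sketch-and-project for A x = b.

    Each step sketches the system with a random matrix S (here Gaussian)
    and projects the iterate onto the solution set of S A x = S b.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[1]
    x = np.zeros(n)
    for _ in range(num_iters):
        S = rng.standard_normal((ell, A.shape[0]))  # random sketch
        SA, Sb = S @ A, S @ b                       # sketched system
        # Project x onto {z : SA z = Sb} (minimum-norm correction).
        x = x - np.linalg.pinv(SA) @ (SA @ x - Sb)
        if np.linalg.norm(A @ x - b) <= 1e-8 * np.linalg.norm(b):
            break
    return x

# Example usage on a random well-conditioned SPD system.
rng = np.random.default_rng(1)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)
b = rng.standard_normal(n)
x = sketch_and_project(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```

Each iteration projects the current iterate onto the solution set of the sketched system $SAx = Sb$; the paper's analysis bounds the resulting convergence rate in terms of $\kappa_\ell$.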
Related papers
- Combinatorial optimization of the coefficient of determination [0.0]
We develop an efficient algorithm for selecting the $k$-subset of $n$ points in the plane with the highest coefficient of determination.
We experimentally demonstrate our method's optimality over several million trials up to $n=30$ without error.
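As a point of reference, a brute-force baseline for this selection problem simply scores every $k$-subset, which takes exponential time; that is what the paper's efficient algorithm avoids. The helper below is a hypothetical illustration using the standard identity $R^2 = \mathrm{corr}(x,y)^2$ for simple linear regression, not the paper's method.

```python
import numpy as np
from itertools import combinations

def best_r2_subset(points, k):
    """Brute-force baseline: find the k-subset of 2D points maximizing
    R^2 under simple linear regression y ~ a + b x. Exponential in n."""
    best, best_subset = -np.inf, None
    for idx in combinations(range(len(points)), k):
        x, y = points[list(idx), 0], points[list(idx), 1]
        r2 = np.corrcoef(x, y)[0, 1] ** 2  # R^2 = squared correlation
        if r2 > best:
            best, best_subset = r2, idx
    return best, best_subset

rng = np.random.default_rng(0)
pts = rng.standard_normal((12, 2))
r2, subset = best_r2_subset(pts, 5)
print(r2, subset)
```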
arXiv Detail & Related papers (2024-10-12T00:53:25Z)
- Faster Sampling from Log-Concave Densities over Polytopes via Efficient Linear Solvers [29.212403229351253]
We present a nearly-optimal implementation of this Dikin walk Markov chain whose per-step complexity is roughly the number of non-zero entries of $A$, while the number of Markov chain steps remains the same.
Key technical ingredients are: 1) showing that the matrices arising in this Dikin walk change slowly; 2) deploying efficient linear solvers that leverage this slow change to speed up matrix inversion by reusing information computed in previous steps; and 3) speeding up the computation of the determinantal term in the Metropolis filter step via a randomized Taylor-series-based estimator.
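Ingredient 2) exploits slowly changing matrices to avoid recomputing inverses from scratch. As a hedged illustration of that general idea (not necessarily the paper's exact solver), the Sherman-Morrison formula updates an inverse after a rank-one change in $O(n^2)$ time:

```python
import numpy as np

def sherman_morrison_update(A_inv, u, v):
    """Given A^{-1}, return (A + u v^T)^{-1} via Sherman-Morrison.

    Costs O(n^2) instead of the O(n^3) of a fresh inversion -- the kind
    of saving available when successive matrices differ by low-rank
    (here rank-one) corrections.
    """
    Au = A_inv @ u
    vA = v @ A_inv
    return A_inv - np.outer(Au, vA) / (1.0 + v @ Au)

# Example: verify against direct inversion.
rng = np.random.default_rng(0)
n = 100
A = rng.standard_normal((n, n)) + n * np.eye(n)
u, v = rng.standard_normal(n), rng.standard_normal(n)
A_inv = np.linalg.inv(A)
fast = sherman_morrison_update(A_inv, u, v)
direct = np.linalg.inv(A + np.outer(u, v))
print(np.max(np.abs(fast - direct)))  # agreement up to roundoff
```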
arXiv Detail & Related papers (2024-09-06T14:49:43Z)
- Solving Dense Linear Systems Faster Than via Preconditioning [1.8854491183340518]
We show that our algorithm achieves an $\tilde O(n^2)$ runtime when $k=O(n^{0.729})$.
Our main algorithm can be viewed as a randomized block coordinate descent method.
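A minimal sketch of the randomized block coordinate descent view, for a symmetric positive-definite system; the uniform block sampling and exact block solves below are illustrative simplifications, not the paper's algorithm.

```python
import numpy as np

def block_coordinate_descent(A, b, block_size=16, num_iters=2000, seed=0):
    """Randomized block coordinate descent for A x = b, A symmetric PD.

    Each step samples a coordinate block B and solves the |B| x |B|
    subsystem exactly, i.e. minimizes f(x) = x^T A x / 2 - b^T x
    over the coordinates in B.
    """
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    x = np.zeros(n)
    for _ in range(num_iters):
        B = rng.choice(n, size=block_size, replace=False)
        # Residual restricted to the block, then exact block solve.
        r_B = b[B] - A[B] @ x
        x[B] += np.linalg.solve(A[np.ix_(B, B)], r_B)
    return x

rng = np.random.default_rng(1)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T + n * np.eye(n)   # symmetric positive definite
b = rng.standard_normal(n)
x = block_coordinate_descent(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))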
arXiv Detail & Related papers (2023-12-14T12:53:34Z)
- Fast Minimization of Expected Logarithmic Loss via Stochastic Dual Averaging [8.990961435218544]
We propose a first-order algorithm named $B$-sample stochastic dual averaging with the logarithmic barrier.
For the Poisson inverse problem, our algorithm attains an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time.
When computing the maximum-likelihood estimate for quantum state tomography, our algorithm yields an $\varepsilon$-optimal solution in $\tilde{O}(d^3/\varepsilon^2)$ time.
arXiv Detail & Related papers (2023-11-05T03:33:44Z)
- Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z)
- Fast and Practical Quantum-Inspired Classical Algorithms for Solving Linear Systems [11.929584800629673]
We propose fast and practical quantum-inspired classical algorithms for solving linear systems.
Our main contribution is the application of the heavy ball momentum method to quantum-inspired classical algorithms for solving linear systems.
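For context, the classical heavy ball (Polyak momentum) iteration for a symmetric positive-definite system $Ax=b$ is sketched below; the step-size and momentum choices are the textbook ones for quadratics, not the paper's quantum-inspired variant.

```python
import numpy as np

def heavy_ball(A, b, num_iters=500):
    """Heavy ball (Polyak momentum) iteration for A x = b, A symmetric PD.

    x_{t+1} = x_t + alpha * (b - A x_t) + beta * (x_t - x_{t-1})

    With eigenvalues in [mu, L], the classical optimal choices are
    alpha = 4 / (sqrt(L) + sqrt(mu))^2 and
    beta  = ((sqrt(L) - sqrt(mu)) / (sqrt(L) + sqrt(mu)))^2.
    (Computing the spectrum here is purely for illustration.)
    """
    eigs = np.linalg.eigvalsh(A)
    mu, L = eigs[0], eigs[-1]
    alpha = 4.0 / (np.sqrt(L) + np.sqrt(mu)) ** 2
    beta = ((np.sqrt(L) - np.sqrt(mu)) / (np.sqrt(L) + np.sqrt(mu))) ** 2
    x_prev = np.zeros_like(b)
    x = np.zeros_like(b)
    for _ in range(num_iters):
        x, x_prev = x + alpha * (b - A @ x) + beta * (x - x_prev), x
    return x

rng = np.random.default_rng(0)
n = 200
M = rng.standard_normal((n, n))
A = M @ M.T / n + np.eye(n)   # symmetric positive definite
b = rng.standard_normal(n)
x = heavy_ball(A, b)
print(np.linalg.norm(A @ x - b) / np.linalg.norm(b))
```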
arXiv Detail & Related papers (2023-07-13T08:46:19Z)
- Clustering Mixture Models in Almost-Linear Time via List-Decodable Mean Estimation [58.24280149662003]
We study the problem of list-decodable mean estimation, where an adversary can corrupt a majority of the dataset.
We develop new algorithms for list-decodable mean estimation, achieving nearly-optimal statistical guarantees.
arXiv Detail & Related papers (2021-06-16T03:34:14Z)
- Linear Bandit Algorithms with Sublinear Time Complexity [67.21046514005029]
We propose to accelerate existing linear bandit algorithms to achieve per-step time complexity sublinear in the number of arms $K$.
We show that our proposed algorithms can achieve $O(K^{1-\alpha(T)})$ per-step complexity for some $\alpha(T) > 0$ and $\widetilde{O}(\sqrt{T})$ regret, where $T$ is the time horizon.
arXiv Detail & Related papers (2021-03-03T22:42:15Z)
- Quantum algorithms for spectral sums [50.045011844765185]
We propose new quantum algorithms for estimating spectral sums of positive semi-definite (PSD) matrices.
We show how the algorithms and techniques used in this work can be applied to three problems in spectral graph theory.
arXiv Detail & Related papers (2020-11-12T16:29:45Z)
- Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k\times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(\frac{kr}{\epsilon^2})$ samples.
We propose an algorithm that uses $\mathcal{O}(\frac{kr}{\epsilon^2}\log^2\frac{k}{\epsilon})$ samples, a number linear in the high dimension and nearly linear in the matrices' typically low rank.
arXiv Detail & Related papers (2020-09-30T19:10:32Z)
- Learning nonlinear dynamical systems from a single trajectory [102.60042167341956]
We introduce algorithms for learning nonlinear dynamical systems of the form $x_{t+1}=\sigma(\Theta^{\star}x_t)+\varepsilon_t$.
We give an algorithm that recovers the weight matrix $\Theta^{\star}$ from a single trajectory with optimal sample complexity and linear running time.
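To make the setup concrete, the sketch below simulates one trajectory from this model with a ReLU link and fits $\Theta$ with a GLMtron-style update; this is an illustrative stand-in under assumed parameters, and the paper's algorithm and its optimality guarantees are more refined.

```python
import numpy as np

def learn_dynamics(X, sigma, lr=0.5, num_iters=300):
    """GLMtron-style recovery of Theta from one trajectory X[0..T].

    Model: x_{t+1} = sigma(Theta @ x_t) + noise, with sigma a known,
    nondecreasing, 1-Lipschitz link. The update below is the classic
    GLMtron step (squared-loss gradient without the sigma' factor),
    applied to consecutive trajectory pairs (x_t, x_{t+1}).
    """
    d = X.shape[1]
    inputs, targets = X[:-1], X[1:]
    Theta = np.zeros((d, d))
    for _ in range(num_iters):
        preds = sigma(inputs @ Theta.T)                       # (T, d)
        grad = (preds - targets).T @ inputs / len(inputs)     # (d, d)
        Theta -= lr * grad
    return Theta

# Simulate a single trajectory from a stable system with a ReLU link.
rng = np.random.default_rng(0)
d, T = 5, 5000
Theta_star = 0.4 * rng.standard_normal((d, d)) / np.sqrt(d)
relu = lambda z: np.maximum(z, 0.0)
X = np.zeros((T + 1, d))
for t in range(T):
    X[t + 1] = relu(Theta_star @ X[t]) + 0.5 * rng.standard_normal(d)
Theta_hat = learn_dynamics(X, relu)
print(np.linalg.norm(Theta_hat - Theta_star))  # approximate recovery
```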
arXiv Detail & Related papers (2020-04-30T10:42:48Z)
This list is automatically generated from the titles and abstracts of the papers on this site.