Randomized Kaczmarz Methods with Beyond-Krylov Convergence
- URL: http://arxiv.org/abs/2501.11673v1
- Date: Mon, 20 Jan 2025 18:55:51 GMT
- Title: Randomized Kaczmarz Methods with Beyond-Krylov Convergence
- Authors: Michał Dereziński, Deanna Needell, Elizaveta Rebrova, Jiaming Yang
- Abstract summary: We introduce Kaczmarz++, an accelerated randomized block Kaczmarz algorithm that exploits outlying singular values in the input to attain a fast Krylov-style convergence.
We show that Kaczmarz++ captures large outlying singular values provably faster than popular Krylov methods, for both over- and under-determined systems.
- Score: 8.688801614519988
- License:
- Abstract: Randomized Kaczmarz methods form a family of linear system solvers which converge by repeatedly projecting their iterates onto randomly sampled equations. While effective in some contexts, such as highly over-determined least squares, Kaczmarz methods are traditionally deemed secondary to Krylov subspace methods, since this latter family of solvers can exploit outliers in the input's singular value distribution to attain fast convergence on ill-conditioned systems. In this paper, we introduce Kaczmarz++, an accelerated randomized block Kaczmarz algorithm that exploits outlying singular values in the input to attain a fast Krylov-style convergence. Moreover, we show that Kaczmarz++ captures large outlying singular values provably faster than popular Krylov methods, for both over- and under-determined systems. We also develop an optimized variant for positive semidefinite systems, called CD++, demonstrating empirically that it is competitive in arithmetic operations with both CG and GMRES on a collection of benchmark problems. To attain these results, we introduce several novel algorithmic improvements to the Kaczmarz framework, including adaptive momentum acceleration, Tikhonov-regularized projections, and a memoization scheme for reusing information from previously sampled equation blocks.
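For context, here is a minimal sketch of the classical randomized Kaczmarz iteration that Kaczmarz++ builds on, with rows sampled proportionally to their squared norms. This is the baseline single-row method, not the paper's accelerated block variant with momentum, regularized projections, or memoization:

```python
import numpy as np

def randomized_kaczmarz(A, b, iters=10_000, seed=0):
    """Basic randomized Kaczmarz: at each step, project the iterate onto
    one randomly sampled equation a_i^T x = b_i, with rows sampled
    proportionally to their squared norms."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    row_norms2 = np.einsum("ij,ij->i", A, A)  # squared row norms
    probs = row_norms2 / row_norms2.sum()
    x = np.zeros(n)
    for _ in range(iters):
        i = rng.choice(m, p=probs)
        # Orthogonal projection onto the hyperplane {x : a_i^T x = b_i}
        x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]
    return x

# Example on a consistent over-determined system
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 20))
x_true = rng.standard_normal(20)
b = A @ x_true
print(np.linalg.norm(randomized_kaczmarz(A, b) - x_true))  # small error
```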
Related papers
- Linear Convergence of Reshuffling Kaczmarz Methods With Sparse Constraints [7.936519714074615]
The Kaczmarz method (KZ) and its variants have been extensively studied due to their simplicity and efficiency in solving linear equation systems.
We provide the first theoretical convergence guarantees for KHT by showing that it converges linearly to the solution of a system with sparsity constraints.
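As a rough illustration (an assumption on our part, since the summary does not spell out the update), KHT can be read as interleaving reshuffled Kaczmarz sweeps with hard thresholding onto the sparsity constraint:

```python
import numpy as np

def hard_threshold(x, s):
    # Keep the s largest-magnitude entries of x, zero out the rest.
    out = np.zeros_like(x)
    idx = np.argsort(np.abs(x))[-s:]
    out[idx] = x[idx]
    return out

def kaczmarz_hard_thresholding(A, b, s, sweeps=200, seed=0):
    """Illustrative sparse-Kaczmarz loop: one reshuffled pass of Kaczmarz
    projections per epoch, followed by hard thresholding onto s-sparse
    vectors. A sketch in the spirit of KHT, not the paper's exact algorithm."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    row_norms2 = np.einsum("ij,ij->i", A, A)
    for _ in range(sweeps):
        for i in rng.permutation(m):  # reshuffling: new random order each epoch
            x += (b[i] - A[i] @ x) / row_norms2[i] * A[i]
        x = hard_threshold(x, s)      # enforce the sparsity constraint
    return x
```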
arXiv Detail & Related papers (2023-04-20T07:14:24Z)
- Bregman Power k-Means for Clustering Exponential Family Data [11.434503492579477]
We bridge algorithmic advances to classical work on hard clustering under Bregman divergences.
The elegant properties of Bregman divergences allow us to maintain closed form updates in a simple and transparent algorithm.
We consider thorough empirical analyses on simulated experiments and a case study on rainfall data, finding that the proposed method outperforms existing peer methods in a variety of non-Gaussian data settings.
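A minimal sketch of the classical Bregman hard-clustering step the paper builds on (not its power k-means annealing): the closed-form updates come from the fact that, for any Bregman divergence, the arithmetic mean minimizes total divergence within a cluster. The `kl_divergence` generator here is our illustrative choice:

```python
import numpy as np

def kl_divergence(X, c):
    # Bregman divergence generated by negative entropy (entries of X, c > 0)
    return np.sum(X * np.log(X / c) - X + c, axis=-1)

def bregman_kmeans(X, k, divergence, iters=50, seed=0):
    """Lloyd-style hard clustering under an arbitrary Bregman divergence.
    Assignments use the divergence; the centroid update is always the
    arithmetic mean, which keeps every iteration in closed form."""
    rng = np.random.default_rng(seed)
    centers = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assign each point to its divergence-nearest center
        dists = np.stack([divergence(X, c) for c in centers], axis=1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # guard against empty clusters
                centers[j] = X[labels == j].mean(axis=0)
    return labels, centers
```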
arXiv Detail & Related papers (2022-06-22T06:09:54Z)
- Distributed Dynamic Safe Screening Algorithms for Sparse Regularization [73.85961005970222]
We propose a new distributed dynamic safe screening (DDSS) method for sparsity-regularized models and apply it to shared-memory and distributed-memory architectures, respectively.
We prove that the proposed method achieves a linear convergence rate with lower overall complexity and, almost surely, eliminates almost all inactive features in a finite number of iterations.
arXiv Detail & Related papers (2022-04-23T02:45:55Z)
- Faster One-Sample Stochastic Conditional Gradient Method for Composite Convex Minimization [61.26619639722804]
We propose a conditional gradient method (CGM) for minimizing convex finite-sum objectives formed as a sum of smooth and non-smooth terms.
The proposed method, equipped with a stochastic average gradient (SAG) estimator, requires only one sample per iteration. Nevertheless, it guarantees fast convergence rates on par with more sophisticated variance reduction techniques.
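A hedged sketch of the idea for a simple special case (least squares over an l1 ball, our choice of objective and constraint set): a SAG-style table of per-sample gradients supplies the search direction for the Frank-Wolfe step, with one gradient refreshed per iteration:

```python
import numpy as np

def sag_frank_wolfe(A, b, radius, iters=2000, seed=0):
    """Illustrative one-sample conditional gradient for
    min_x sum_i (a_i^T x - b_i)^2 / 2 over the l1 ball of given radius.
    The table average of last-seen per-sample gradients (SAG estimator)
    stands in for the full gradient."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    grad_table = np.zeros((m, n))       # last seen per-sample gradients
    g_avg = np.zeros(n)                 # running average of the table
    for t in range(1, iters + 1):
        i = rng.integers(m)
        g_i = (A[i] @ x - b[i]) * A[i]  # gradient of the i-th summand
        g_avg += (g_i - grad_table[i]) / m
        grad_table[i] = g_i
        # Linear minimization oracle for the l1 ball: a signed vertex
        j = np.argmax(np.abs(g_avg))
        s = np.zeros(n)
        s[j] = -radius * np.sign(g_avg[j])
        x += (2.0 / (t + 2)) * (s - x)  # classical Frank-Wolfe step size
    return x
```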
arXiv Detail & Related papers (2022-02-26T19:10:48Z)
- Nesterov Accelerated ADMM for Fast Diffeomorphic Image Registration [63.15453821022452]
Recent deep-learning-based approaches have achieved sub-second runtimes for diffeomorphic image registration (DiffIR).
We propose a simple iterative scheme that functionally composes intermediate non-stationary velocity fields.
We then propose a convex optimisation model that uses a regularisation term of arbitrary order to impose smoothness on these velocity fields.
arXiv Detail & Related papers (2021-09-26T19:56:45Z)
- Acceleration Methods [57.202881673406324]
We first use quadratic optimization problems to introduce two key families of acceleration methods.
We discuss momentum methods in detail, starting with the seminal work of Nesterov.
We conclude by discussing restart schemes, a set of simple techniques for reaching nearly optimal convergence rates.
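For concreteness, here is the textbook Nesterov accelerated gradient loop on a quadratic, the setting the survey uses to introduce these methods (a standard scheme, not anything specific to the survey):

```python
import numpy as np

def nesterov_quadratic(A, b, iters=500):
    """Nesterov's accelerated gradient on f(x) = x^T A x / 2 - b^T x,
    with A symmetric positive definite, using the standard constant-step
    momentum scheme with smoothness constant L = lambda_max(A)."""
    L = np.linalg.eigvalsh(A)[-1]       # largest eigenvalue of A
    x = y = np.zeros_like(b)
    t = 1.0
    for _ in range(iters):
        x_next = y - (A @ y - b) / L    # gradient step from the lookahead point
        t_next = (1 + np.sqrt(1 + 4 * t * t)) / 2
        y = x_next + (t - 1) / t_next * (x_next - x)  # momentum extrapolation
        x, t = x_next, t_next
    return x
```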
arXiv Detail & Related papers (2021-01-23T17:58:25Z)
- Accelerated Message Passing for Entropy-Regularized MAP Inference [89.15658822319928]
Maximum a posteriori (MAP) inference in discrete-valued random fields is a fundamental problem in machine learning.
Due to the difficulty of this problem, linear programming (LP) relaxations are commonly used to derive specialized message passing algorithms.
We present randomized methods for accelerating these algorithms by leveraging techniques that underlie classical accelerated gradient methods.
arXiv Detail & Related papers (2020-07-01T18:43:32Z)
- Connecting the Dots: Numerical Randomized Hamiltonian Monte Carlo with State-Dependent Event Rates [0.0]
We introduce a robust, easy to use and computationally fast alternative to conventional Markov chain Monte Carlo methods for continuous target distributions.
The proposed algorithm may yield large speedups and improvements in stability relative to relevant benchmarks.
Given access to a high-quality ODE solver, the proposed methodology is both easy to implement and easy to use, even for highly challenging and high-dimensional target distributions.
arXiv Detail & Related papers (2020-05-04T06:23:13Z)
- Stochastic gradient algorithms from ODE splitting perspective [0.0]
We present a different view on optimization, which goes back to splitting schemes for the approximate solution of ODEs.
In this work, we provide a connection between the stochastic gradient descent approach and a first-order splitting scheme for ODEs.
We consider the special case of splitting, which is inspired by machine learning applications and derive a new upper bound on the global splitting error for it.
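A small sketch of that connection, under the usual reading: one Lie-Trotter splitting pass over the sub-flows x' = -g_i(x), each integrated by a single explicit Euler step, is exactly one epoch of incremental SGD with step size h:

```python
def splitting_epoch(x, grads, h):
    """One pass of first-order (Lie-Trotter) splitting for the gradient
    flow x' = -sum_i g_i(x): integrate each sub-flow x' = -g_i(x) with a
    single explicit Euler step. The composition coincides with one epoch
    of incremental SGD with step size h."""
    for g in grads:
        x = x - h * g(x)  # Euler step for the i-th sub-flow
    return x

# Example: f(x) = sum_i (x - c_i)^2 / 2, so g_i(x) = x - c_i
cs = [1.0, -2.0, 3.0]
grads = [lambda x, c=c: x - c for c in cs]
x = 0.0
for _ in range(100):
    x = splitting_epoch(x, grads, h=0.1)
# Converges near the minimizer (the mean of c_i), up to an O(h)
# splitting error, which is what global splitting-error bounds quantify.
print(x)
```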
arXiv Detail & Related papers (2020-04-19T22:45:32Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods that are among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
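As background, a minimal sketch-and-solve routine with a Gaussian embedding, one of the two embeddings the paper studies (our simplified variant solves the sketched problem directly; the paper's optimal methods instead use such embeddings inside preconditioned iterative solvers):

```python
import numpy as np

def sketched_least_squares(A, b, sketch_rows, seed=0):
    """Sketch-and-solve with a Gaussian embedding: compress the m x n
    least-squares problem to sketch_rows x n via S with i.i.d.
    N(0, 1/sketch_rows) entries, then solve the small problem exactly."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    S = rng.standard_normal((sketch_rows, m)) / np.sqrt(sketch_rows)
    x, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return x
```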
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
- Average-case Acceleration Through Spectral Density Estimation [35.01931431231649]
We develop a framework for the average-case analysis of random quadratic problems.
We derive algorithms that are optimal under this analysis.
We develop explicit algorithms for the uniform, Marchenko-Pastur, and exponential distributions.
arXiv Detail & Related papers (2020-02-12T01:44:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.