Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based
Preconditioners
- URL: http://arxiv.org/abs/2104.14101v1
- Date: Thu, 29 Apr 2021 04:36:41 GMT
- Title: Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based
Preconditioners
- Authors: Jonathan Lacotte and Mert Pilanci
- Abstract summary: We consider least-squares problems with quadratic regularization and propose novel sketching-based iterative methods with an adaptive sketch size.
The sketch size can be as small as the effective dimension of the data matrix to guarantee linear convergence.
A major difficulty in choosing the sketch size in terms of the effective dimension lies in the fact that the latter is usually unknown in practice.
- Score: 37.03247707259297
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider least-squares problems with quadratic regularization and propose
novel sketching-based iterative methods with an adaptive sketch size. The
sketch size can be as small as the effective dimension of the data matrix to
guarantee linear convergence. However, a major difficulty in choosing the
sketch size in terms of the effective dimension lies in the fact that the
latter is usually unknown in practice. Current sketching-based solvers for
regularized least-squares fall short on addressing this issue. Our main
contribution is to propose adaptive versions of standard sketching-based
iterative solvers, namely, the iterative Hessian sketch and the preconditioned
conjugate gradient method, that do not require a priori estimation of the
effective dimension. We propose an adaptive mechanism to control the sketch
size according to the progress made in each step of the iterative solver. If
enough progress is not made, the sketch size increases to improve the
convergence rate. We prove that the adaptive sketch size scales at most in
terms of the effective dimension, and that our adaptive methods are guaranteed
to converge linearly. Consequently, our adaptive methods improve the
state-of-the-art complexity for solving dense, ill-conditioned least-squares
problems. Importantly, we illustrate numerically on several synthetic and real
datasets that our method is extremely efficient and is often significantly
faster than standard least-squares solvers such as a direct factorization-based
solver, the conjugate gradient method, and its preconditioned variants.
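The adaptive mechanism described in the abstract can be made concrete with a short sketch. The snippet below is a minimal illustration, not the authors' implementation: it runs an iterative-Hessian-sketch-style update for the ridge-regularized least-squares objective and doubles the sketch size whenever an iteration fails to shrink the gradient norm by a fixed factor. The function name `adaptive_ihs`, the initial sketch size `m0`, the Gaussian embedding, and the progress threshold `decrease` are all illustrative assumptions.

```python
import numpy as np

def adaptive_ihs(A, b, lam, m0=64, tol=1e-8, decrease=0.9, max_iter=200, seed=0):
    """Illustrative adaptive iterative Hessian sketch for
    min_x ||A x - b||^2 + lam * ||x||^2 (names and defaults are assumptions)."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    m = m0

    def sketched_hessian(m):
        # Gaussian embedding of the rows of A; SRHT-type embeddings could be used instead.
        S = rng.standard_normal((m, n)) / np.sqrt(m)
        SA = S @ A
        return SA.T @ SA + lam * np.eye(d)

    H = sketched_hessian(m)
    prev = np.inf
    for _ in range(max_iter):
        grad = A.T @ (A @ x - b) + lam * x
        gnorm = np.linalg.norm(grad)
        if gnorm <= tol:
            break
        # Newton-type step preconditioned by the sketched Hessian.
        x = x - np.linalg.solve(H, grad)
        # If the gradient norm did not drop enough, grow and refresh the sketch.
        if gnorm > decrease * prev:
            m = min(2 * m, n)
            H = sketched_hessian(m)
        prev = gnorm
    return x, m
```

In the paper's analysis, the sketch size required for linear convergence scales with the effective dimension of the data matrix rather than with its number of columns; the doubling rule above is only a simple stand-in for the paper's adaptive control of the sketch size.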
Related papers
- Sketch-and-Solve: Optimized Overdetermined Least-Squares Using Randomized Numerical Linear Algebra [0.0]
This paper focuses on applying sketch-and-solve algorithms to efficiently solve the overdetermined least squares problem.
We introduce the Sketch-and-Apply (SAA-SAS) algorithm, which leverages randomized numerical linear algebra techniques to compute approximate solutions efficiently.
Our results highlight the potential of sketch-and-solve techniques in efficiently handling large-scale numerical linear algebra problems (a generic sketch-and-solve baseline is illustrated after this list).
arXiv Detail & Related papers (2024-09-22T04:29:51Z) - Distributed Least Squares in Small Space via Sketching and Bias Reduction [0.0]
Matrix sketching is a powerful tool for reducing the size of large data matrices.
We show that these limitations can be circumvented in the distributed setting by designing sketching methods that minimize the bias of the estimator, rather than its error.
In particular, we give a sparse sketching method running in optimal space and current matrix multiplication time, which recovers a nearly-unbiased least squares estimator using two passes over the data.
arXiv Detail & Related papers (2024-05-08T18:16:37Z) - Sharp Analysis of Sketch-and-Project Methods via a Connection to
Randomized Singular Value Decomposition [14.453949553412821]
We develop a theoretical framework for obtaining sharp guarantees on the convergence rate of sketch-and-project methods.
We show that the convergence rate improves at least linearly with the sketch size, and even faster when the data matrix exhibits certain spectral decays.
Our experiments support the theory and demonstrate that even extremely sparse sketches exhibit the convergence properties predicted by our framework (a generic sketch-and-project step is illustrated after this list).
arXiv Detail & Related papers (2022-08-20T03:11:13Z) - Distributed Sketching for Randomized Optimization: Exact
Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z) - Adaptive Newton Sketch: Linear-time Optimization with Quadratic
Convergence and Effective Hessian Dimensionality [32.292481678592544]
We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function.
Our first contribution is to show that, at each iteration, the embedding dimension can be as small as the effective dimension of the Hessian matrix.
This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods.
arXiv Detail & Related papers (2021-05-15T20:24:26Z) - Conditional gradient methods for stochastically constrained convex
minimization [54.53786593679331]
We propose two novel conditional gradient-based methods for solving structured convex optimization problems.
The most important feature of our framework is that only a subset of the constraints is processed at each iteration.
Our algorithms rely on variance reduction and smoothing used in conjunction with conditional gradient steps, and are accompanied by rigorous convergence guarantees.
arXiv Detail & Related papers (2020-07-07T21:26:35Z) - Balancing Rates and Variance via Adaptive Batch-Size for Stochastic
Optimization Problems [120.21685755278509]
In this work, we seek to balance the fact that an attenuating step-size is required for exact convergence against the fact that a constant step-size learns faster, albeit only up to an error.
Rather than fixing the minibatch and the step-size at the outset, we propose to allow these parameters to evolve adaptively.
arXiv Detail & Related papers (2020-07-02T16:02:02Z) - Cogradient Descent for Bilinear Optimization [124.45816011848096]
We introduce a Cogradient Descent algorithm (CoGD) to address the bilinear problem.
We solve one variable by considering its coupling relationship with the other, leading to a synchronous gradient descent.
Our algorithm is applied to solve problems with one variable under the sparsity constraint.
arXiv Detail & Related papers (2020-06-16T13:41:54Z) - Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
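Similarly, the sketch-and-project family analyzed in the related paper on sharp sketch-and-project guarantees admits a compact generic form. The snippet below is a hedged sketch of the standard Euclidean sketch-and-project iteration for a consistent linear system, not the specific variants studied in that paper; the sketch size `m` and the iteration count are placeholders.

```python
import numpy as np

def sketch_and_project(A, b, m=8, iters=500, seed=0):
    """Generic sketch-and-project for a consistent system A x = b:
    each step projects the iterate onto the solution set of the
    sketched system (S^T A) x = S^T b."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    for _ in range(iters):
        S = rng.standard_normal((n, m))               # fresh Gaussian sketch
        SA = S.T @ A                                  # m x d sketched matrix
        r = SA @ x - S.T @ b                          # sketched residual
        x -= SA.T @ np.linalg.lstsq(SA @ SA.T, r, rcond=None)[0]
    return x
```

Larger sketch sizes make each projection capture more of the row space of A, which matches the summary's claim that the convergence rate improves at least linearly with the sketch size.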