Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization
- URL: http://arxiv.org/abs/2006.05874v2
- Date: Fri, 23 Oct 2020 12:05:16 GMT
- Title: Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization
- Authors: Jonathan Lacotte and Mert Pilanci
- Abstract summary: We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
- Score: 56.05635751529922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new randomized algorithm for solving L2-regularized
least-squares problems based on sketching. We consider two of the most popular
random embeddings, namely, Gaussian embeddings and the Subsampled Randomized
Hadamard Transform (SRHT). While current randomized solvers for least-squares
optimization prescribe an embedding dimension at least as large as the data
dimension, we show that the embedding dimension can be reduced to the effective
dimension of the optimization problem, and still preserve high-probability
convergence guarantees. In this regard, we derive sharp matrix deviation
inequalities over ellipsoids for both Gaussian and SRHT embeddings.
Specifically, we improve on the constant of a classical Gaussian concentration
bound whereas, for SRHT embeddings, our deviation inequality involves a novel
technical approach. Leveraging these bounds, we are able to design a practical
and adaptive algorithm that does not require knowledge of the effective dimension
beforehand. Our method starts with an initial embedding dimension equal to 1
and, over iterations, increases the embedding dimension up to the effective one
at most. Hence, our algorithm improves the state-of-the-art computational
complexity for solving regularized least-squares problems. Further, we show
numerically that it outperforms standard iterative solvers such as the
conjugate gradient method and its pre-conditioned version on several standard
machine learning datasets.
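To make the adaptive scheme concrete, below is a minimal Python sketch (not the authors' implementation) of a sketch-and-solve loop for the ridge problem min_x 0.5*||Ax - b||^2 + 0.5*lam*||x||^2: a Gaussian embedding of size m = 1 is grown by doubling whenever a sketched Newton-type step fails to decrease the objective. The function names, the doubling rule, and the acceptance test are illustrative assumptions; the paper's actual algorithm and its SRHT variant differ in their details.

```python
import numpy as np

def effective_dimension(A, lam):
    """Effective dimension d_e(lam) = sum_i s_i^2 / (s_i^2 + lam),
    where s_i are the singular values of A (illustrative helper)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** 2 / (s ** 2 + lam)))

def ridge_objective(A, b, lam, x):
    r = A @ x - b
    return 0.5 * (r @ r) + 0.5 * lam * (x @ x)

def adaptive_sketch_solver(A, b, lam, tol=1e-8, max_iter=200, seed=0):
    """Toy adaptive solver for min_x 0.5*||Ax - b||^2 + 0.5*lam*||x||^2.
    The embedding dimension starts at 1 and doubles whenever the sketched
    step is rejected, loosely mimicking the adaptive scheme above."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    m = 1
    S = rng.standard_normal((m, n)) / np.sqrt(m)       # Gaussian embedding
    H = (S @ A).T @ (S @ A) + lam * np.eye(d)          # sketched regularized Hessian
    for _ in range(max_iter):
        g = A.T @ (A @ x - b) + lam * x                # exact gradient
        if np.linalg.norm(g) < tol:
            break
        step = np.linalg.solve(H, g)
        if ridge_objective(A, b, lam, x - step) < ridge_objective(A, b, lam, x):
            x = x - step                               # accept the sketched step
        else:                                          # reject: grow the embedding
            m = min(2 * m, n)
            S = rng.standard_normal((m, n)) / np.sqrt(m)
            H = (S @ A).T @ (S @ A) + lam * np.eye(d)
    return x
```

The `effective_dimension` helper computes the quantity d_e(lam) = sum_i s_i^2 / (s_i^2 + lam) that the paper's adaptive rule targets without ever evaluating it explicitly; in this toy loop the sketch only grows when a step is rejected, so it typically stabilizes well below the data dimension.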
Related papers
- Fast Screening Rules for Optimal Design via Quadratic Lasso
Reformulation [0.135975510645475]
In this work, we derive safe screening rules that can be used to discard inessential samples.
The new tests are much faster to compute, especially for problems involving a parameter space of high dimension.
We show how an existing homotopy algorithm to compute the regularization path of the lasso method can be reparametrized with respect to the squared $\ell_1$-penalty.
arXiv Detail & Related papers (2023-10-13T08:10:46Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z)
- Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality [32.292481678592544]
We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function.
Our first contribution is to show that, at each iteration, the embedding dimension can be as small as the effective dimension of the Hessian matrix.
This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods.
arXiv Detail & Related papers (2021-05-15T20:24:26Z)
- Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners [37.03247707259297]
We consider least-squares problems with quadratic regularization and propose novel sketching-based iterative methods with an adaptive sketch size.
The sketch size can be as small as the effective dimension of the data matrix to guarantee linear convergence.
A major difficulty in choosing the sketch size in terms of the effective dimension lies in the fact that the latter is usually unknown in practice.
arXiv Detail & Related papers (2021-04-29T04:36:41Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\widetilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction; a minimal SRHT construction is sketched after this list.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
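Several of the papers above, as well as the main abstract, rely on the Subsampled Randomized Hadamard Transform. The following is a minimal, self-contained construction under common conventions (random sign flip, orthonormal fast Walsh-Hadamard transform, uniform row subsampling, zero-padding to a power of two); the function names and the padding choice are assumptions of this sketch rather than details taken from any of the papers.

```python
import numpy as np

def fwht(X):
    """Orthonormal fast Walsh-Hadamard transform along axis 0.
    Assumes X.shape[0] is a power of two."""
    X = X.copy()
    n = X.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = X[i:i + h].copy()
            b = X[i + h:i + 2 * h].copy()
            X[i:i + h] = a + b
            X[i + h:i + 2 * h] = a - b
        h *= 2
    return X / np.sqrt(n)

def srht_sketch(A, m, seed=0):
    """Return an (m x d) SRHT sketch of the (n x d) matrix A,
    i.e. S A with S = sqrt(n_pad / m) * P * H * D."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    n_pad = 1 << (n - 1).bit_length()                # next power of two
    A_pad = np.zeros((n_pad, d))
    A_pad[:n] = A
    signs = rng.choice([-1.0, 1.0], size=n_pad)      # random diagonal D
    HDA = fwht(signs[:, None] * A_pad)               # apply H D to the rows
    rows = rng.choice(n_pad, size=m, replace=False)  # uniform subsampling P
    return np.sqrt(n_pad / m) * HDA[rows]
```

The scaling sqrt(n_pad / m) makes E[S^T S] equal to the identity, so (S A)^T (S A) is an unbiased estimate of A^T A, which is the quantity the sketched Hessians in the abstracts above approximate.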