Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization
- URL: http://arxiv.org/abs/2006.05874v2
- Date: Fri, 23 Oct 2020 12:05:16 GMT
- Title: Effective Dimension Adaptive Sketching Methods for Faster Regularized
Least-Squares Optimization
- Authors: Jonathan Lacotte and Mert Pilanci
- Abstract summary: We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
- Score: 56.05635751529922
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a new randomized algorithm for solving L2-regularized
least-squares problems based on sketching. We consider two of the most popular
random embeddings, namely, Gaussian embeddings and the Subsampled Randomized
Hadamard Transform (SRHT). While current randomized solvers for least-squares
optimization prescribe an embedding dimension at least as large as the data
dimension, we show that the embedding dimension can be reduced to the effective
dimension of the optimization problem, and still preserve high-probability
convergence guarantees. In this regard, we derive sharp matrix deviation
inequalities over ellipsoids for both Gaussian and SRHT embeddings.
Specifically, we improve on the constant of a classical Gaussian concentration
bound whereas, for SRHT embeddings, our deviation inequality involves a novel
technical approach. Leveraging these bounds, we are able to design a practical
and adaptive algorithm that does not require knowledge of the effective dimension
beforehand. Our method starts with an initial embedding dimension equal to 1
and, over iterations, increases the embedding dimension up to the effective one
at most. Hence, our algorithm improves the state-of-the-art computational
complexity for solving regularized least-squares problems. Further, we show
numerically that it outperforms standard iterative solvers such as the
conjugate gradient method and its pre-conditioned version on several standard
machine learning datasets.
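To make the adaptive scheme concrete, below is a minimal Python sketch (not the authors' implementation) of a sketch-and-solve loop for the ridge problem min_x 0.5*||Ax - b||^2 + 0.5*lam*||x||^2: a Gaussian embedding of size m = 1 is grown by doubling whenever a sketched Newton-type step fails to decrease the objective. The function names, the doubling rule, and the acceptance test are illustrative assumptions; the paper's actual algorithm and its SRHT variant differ in their details.

```python
import numpy as np

def effective_dimension(A, lam):
    """Effective dimension d_e(lam) = sum_i s_i^2 / (s_i^2 + lam),
    where s_i are the singular values of A (illustrative helper)."""
    s = np.linalg.svd(A, compute_uv=False)
    return float(np.sum(s ** 2 / (s ** 2 + lam)))

def ridge_objective(A, b, lam, x):
    r = A @ x - b
    return 0.5 * (r @ r) + 0.5 * lam * (x @ x)

def adaptive_sketch_solver(A, b, lam, tol=1e-8, max_iter=200, seed=0):
    """Toy adaptive solver for min_x 0.5*||Ax - b||^2 + 0.5*lam*||x||^2.
    The embedding dimension starts at 1 and doubles whenever the sketched
    step is rejected, loosely mimicking the adaptive scheme above."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    m = 1
    S = rng.standard_normal((m, n)) / np.sqrt(m)       # Gaussian embedding
    H = (S @ A).T @ (S @ A) + lam * np.eye(d)          # sketched regularized Hessian
    for _ in range(max_iter):
        g = A.T @ (A @ x - b) + lam * x                # exact gradient
        if np.linalg.norm(g) < tol:
            break
        step = np.linalg.solve(H, g)
        if ridge_objective(A, b, lam, x - step) < ridge_objective(A, b, lam, x):
            x = x - step                               # accept the sketched step
        else:                                          # reject: grow the embedding
            m = min(2 * m, n)
            S = rng.standard_normal((m, n)) / np.sqrt(m)
            H = (S @ A).T @ (S @ A) + lam * np.eye(d)
    return x
```

The `effective_dimension` helper computes the quantity d_e(lam) = sum_i s_i^2 / (s_i^2 + lam) that the paper's adaptive rule targets without ever evaluating it explicitly; in this toy loop the sketch only grows when a step is rejected, so it typically stabilizes well below the data dimension.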
Related papers
- Fast Screening Rules for Optimal Design via Quadratic Lasso
Reformulation [0.135975510645475]
In this work, we derive safe screening rules that can be used to discard inessential samples.
The new tests are much faster to compute, especially for problems involving a parameter space of high dimension.
We show how an existing homotopy algorithm to compute the regularization path of the lasso method can be reparametrized with respect to the squared $\ell_1$-penalty.
arXiv Detail & Related papers (2023-10-13T08:10:46Z)
- Accelerating Cutting-Plane Algorithms via Reinforcement Learning Surrogates [49.84541884653309]
A current standard approach to solving convex discrete optimization problems is the use of cutting-plane algorithms.
Despite the existence of a number of general-purpose cut-generating algorithms, large-scale discrete optimization problems continue to suffer from intractability.
We propose a method for accelerating cutting-plane algorithms via reinforcement learning.
arXiv Detail & Related papers (2023-07-17T20:11:56Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Adaptive Stochastic Optimisation of Nonconvex Composite Objectives [2.1700203922407493]
We propose and analyse a family of generalised composite mirror descent algorithms.
With adaptive step sizes, the proposed algorithms converge without requiring prior knowledge of the problem.
We exploit the low-dimensional structure of the decision sets for high-dimensional problems.
arXiv Detail & Related papers (2022-11-21T18:31:43Z)
- Adaptive Newton Sketch: Linear-time Optimization with Quadratic Convergence and Effective Hessian Dimensionality [32.292481678592544]
We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function.
Our first contribution is to show that, at each iteration, the embedding dimension can be as small as the effective dimension of the Hessian matrix.
This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods.
arXiv Detail & Related papers (2021-05-15T20:24:26Z)
- Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners [37.03247707259297]
We consider least-squares problems with quadratic regularization and propose novel sketching-based iterative methods with an adaptive sketch size.
The sketch size can be as small as the effective dimension of the data matrix to guarantee linear convergence.
A major difficulty in choosing the sketch size in terms of the effective dimension lies in the fact that the latter is usually unknown in practice.
arXiv Detail & Related papers (2021-04-29T04:36:41Z)
- Convergence of adaptive algorithms for weakly convex constrained optimization [59.36386973876765]
We prove the $\widetilde{\mathcal{O}}(t^{-1/4})$ rate of convergence for the norm of the gradient of the Moreau envelope.
Our analysis works with mini-batch size of $1$, constant first and second order moment parameters, and possibly smooth optimization domains.
arXiv Detail & Related papers (2020-06-11T17:43:19Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction; a minimal SRHT construction is sketched after this list.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
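Several of the papers above, as well as the main abstract, rely on the Subsampled Randomized Hadamard Transform. The following is a minimal, self-contained construction under common conventions (random sign flip, orthonormal fast Walsh-Hadamard transform, uniform row subsampling, zero-padding to a power of two); the function names and the padding choice are assumptions of this sketch rather than details taken from any of the papers.

```python
import numpy as np

def fwht(X):
    """Orthonormal fast Walsh-Hadamard transform along axis 0.
    Assumes X.shape[0] is a power of two."""
    X = X.copy()
    n = X.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = X[i:i + h].copy()
            b = X[i + h:i + 2 * h].copy()
            X[i:i + h] = a + b
            X[i + h:i + 2 * h] = a - b
        h *= 2
    return X / np.sqrt(n)

def srht_sketch(A, m, seed=0):
    """Return an (m x d) SRHT sketch of the (n x d) matrix A,
    i.e. S A with S = sqrt(n_pad / m) * P * H * D."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    n_pad = 1 << (n - 1).bit_length()                # next power of two
    A_pad = np.zeros((n_pad, d))
    A_pad[:n] = A
    signs = rng.choice([-1.0, 1.0], size=n_pad)      # random diagonal D
    HDA = fwht(signs[:, None] * A_pad)               # apply H D to the rows
    rows = rng.choice(n_pad, size=m, replace=False)  # uniform subsampling P
    return np.sqrt(n_pad / m) * HDA[rows]
```

The scaling sqrt(n_pad / m) makes E[S^T S] equal to the identity, so (S A)^T (S A) is an unbiased estimate of A^T A, which is the quantity the sketched Hessians in the abstracts above approximate.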