Adaptive Newton Sketch: Linear-time Optimization with Quadratic
Convergence and Effective Hessian Dimensionality
- URL: http://arxiv.org/abs/2105.07291v1
- Date: Sat, 15 May 2021 20:24:26 GMT
- Title: Adaptive Newton Sketch: Linear-time Optimization with Quadratic
Convergence and Effective Hessian Dimensionality
- Authors: Jonathan Lacotte, Yifei Wang, Mert Pilanci
- Abstract summary: We propose a randomized algorithm with quadratic convergence rate for convex optimization problems with a self-concordant, composite, strongly convex objective function.
Our first contribution is to show that, at each iteration, the embedding dimension can be as small as the effective dimension of the Hessian matrix.
This result dramatically improves on the classical linear-quadratic convergence rates of state-of-the-art sub-sampled Newton methods.
- Score: 32.292481678592544
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We propose a randomized algorithm with quadratic convergence rate for convex
optimization problems with a self-concordant, composite, strongly convex
objective function. Our method is based on performing an approximate Newton
step using a random projection of the Hessian. Our first contribution is to
show that, at each iteration, the embedding dimension (or sketch size) can be
as small as the effective dimension of the Hessian matrix. Leveraging this
novel fundamental result, we design an algorithm with a sketch size
proportional to the effective dimension and which exhibits a quadratic rate of
convergence. This result dramatically improves on the classical
linear-quadratic convergence rates of state-of-the-art sub-sampled Newton
methods. However, in most practical cases, the effective dimension is not known
beforehand, and this raises the question of how to pick a sketch size as small
as the effective dimension while preserving a quadratic convergence rate. Our
second and main contribution is thus to propose an adaptive sketch size
algorithm with quadratic convergence rate and which does not require prior
knowledge or estimation of the effective dimension: at each iteration, it
starts with a small sketch size, and increases it until quadratic progress is
achieved. Importantly, we show that the embedding dimension remains
proportional to the effective dimension throughout the entire path and that our
method achieves state-of-the-art computational complexity for solving convex
optimization programs with a strongly convex component.
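To make the two contributions concrete, the following is a minimal, hypothetical Python/NumPy sketch: a Newton step computed from a Gaussian embedding of the square-root Hessian, wrapped in a loop that doubles the sketch size whenever the step fails a sufficient-decrease test. The objective (L2-regularized logistic regression), the specific progress test, and the doubling rule are illustrative assumptions, not the paper's exact algorithm.

```python
import numpy as np


def make_problem(A, b, nu):
    """L2-regularized logistic regression (labels b in {-1, +1}): an
    illustrative strongly convex objective, not the paper's general setting."""
    def f(w):
        return np.sum(np.log1p(np.exp(-b * (A @ w)))) + 0.5 * nu * (w @ w)

    def grad_and_sqrt_hess(w):
        p = 1.0 / (1.0 + np.exp(-b * (A @ w)))      # sigmoid(b * Aw)
        g = -A.T @ (b * (1.0 - p)) + nu * w
        C = np.sqrt(p * (1.0 - p))[:, None] * A     # C.T @ C + nu*I = Hessian
        return g, C

    return f, grad_and_sqrt_hess


def adaptive_newton_sketch(A, b, nu=1e-3, m0=8, tol=1e-10, max_iter=50, seed=0):
    """Newton's method with a sketched Hessian; the embedding dimension m
    doubles whenever the sketched step fails a sufficient-decrease test."""
    rng = np.random.default_rng(seed)
    n, d = A.shape
    f, grad_and_sqrt_hess = make_problem(A, b, nu)
    w, m = np.zeros(d), m0
    for _ in range(max_iter):
        g, C = grad_and_sqrt_hess(w)
        fw = f(w)
        while True:
            S = rng.standard_normal((m, n)) / np.sqrt(m)  # Gaussian embedding
            SC = S @ C                                    # sketched sqrt-Hessian
            Hs = SC.T @ SC + nu * np.eye(d)               # approximate Hessian
            dw = np.linalg.solve(Hs, -g)
            dec2 = -g @ dw              # approximate Newton decrement, squared
            # Illustrative progress test: with a good sketch, the unit Newton
            # step achieves sufficient decrease; otherwise double the sketch.
            if f(w + dw) <= fw - 0.25 * dec2 or m >= n:
                break
            m *= 2
        t = 1.0
        while f(w + t * dw) > fw - 0.25 * t * dec2:
            t *= 0.5                                      # safeguard backtracking
        w = w + t * dw
        if dec2 <= tol:
            break
    return w, m


# Toy usage: random features with planted labels.
rng = np.random.default_rng(1)
A = rng.standard_normal((2000, 50))
b = np.sign(A @ rng.standard_normal(50) + 0.1 * rng.standard_normal(2000))
w_hat, final_m = adaptive_newton_sketch(A, b)
print("final sketch size:", final_m)
```

The doubling rule mirrors the adaptive strategy described in the abstract: the embedding starts small and grows only until the step makes adequate progress, rather than being fixed at a pessimistic worst-case size up front.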
Related papers
- Linearization Algorithms for Fully Composite Optimization [61.20539085730636]
This paper studies first-order algorithms for solving fully composite optimization problems over convex compact sets.
We leverage the structure of the objective by handling its differentiable and non-differentiable parts separately, linearizing only the smooth components.
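As a point of reference for this split (a classical instance of the idea, not necessarily that paper's algorithm), here is a minimal generalized conditional-gradient loop: the smooth term is linearized, the nonsmooth $\ell_1$ term is kept exact, and the resulting subproblem over the box $[-1,1]^d$ has a closed-form solution.

```python
import numpy as np


def generalized_conditional_gradient(grad_f, lam, d, iters=200):
    """min over the box [-1,1]^d of f(x) + lam*||x||_1: linearize only the
    smooth f; the linearized subproblem with the exact l1 term separates
    per coordinate and has the closed-form solution below."""
    x = np.zeros(d)
    for k in range(iters):
        c = grad_f(x)
        # argmin_{y in [-1,1]^d} <c, y> + lam*||y||_1, coordinate-wise:
        y = np.where(np.abs(c) > lam, -np.sign(c), 0.0)
        x += 2.0 / (k + 2.0) * (y - x)        # classic open-loop step size
    return x


# Toy usage: f(x) = 0.5*||x - t||^2 for a fixed target t (hypothetical data).
t = np.array([0.9, -0.2, 0.05, -0.7])
x_hat = generalized_conditional_gradient(lambda x: x - t, lam=0.1, d=4)
print(x_hat)
```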
arXiv Detail & Related papers (2023-02-24T18:41:48Z)
- Fast Computation of Optimal Transport via Entropy-Regularized Extragradient Methods [75.34939761152587]
Efficient computation of the optimal transport distance between two distributions serves as an algorithmic subroutine that empowers various applications.
This paper develops a scalable first-order optimization-based method that computes optimal transport to within $\varepsilon$ additive accuracy.
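That paper's extragradient scheme is more involved; as a compact reference for what entropy-regularized optimal transport computes, here is the classical Sinkhorn iteration on two histograms (a standard baseline, not the paper's method; the grid and cost below are made-up toy data).

```python
import numpy as np


def sinkhorn(mu, nu_w, C, eta=0.05, iters=1000):
    """Entropy-regularized OT between histograms mu and nu_w under cost C:
    min_P <C, P> - eta * H(P), solved by alternating marginal scalings."""
    K = np.exp(-C / eta)                  # Gibbs kernel
    u = np.ones_like(mu)
    for _ in range(iters):
        v = nu_w / (K.T @ u)              # match column marginals
        u = mu / (K @ v)                  # match row marginals
    P = u[:, None] * K * v[None, :]       # approximate transport plan
    return float(np.sum(P * C))           # entropic approximation of OT cost


# Toy usage: two Gaussian-bump histograms on a 1-D grid, squared-distance cost.
xs = np.linspace(0.0, 1.0, 50)
mu = np.exp(-((xs - 0.3) ** 2) / 0.01); mu /= mu.sum()
nu_w = np.exp(-((xs - 0.7) ** 2) / 0.02); nu_w /= nu_w.sum()
cost = (xs[:, None] - xs[None, :]) ** 2
print(sinkhorn(mu, nu_w, cost))
```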
arXiv Detail & Related papers (2023-01-30T15:46:39Z)
- Sharp Analysis of Sketch-and-Project Methods via a Connection to Randomized Singular Value Decomposition [14.453949553412821]
We develop a theoretical framework for obtaining sharp guarantees on the convergence rate of sketch-and-project methods.
We show that the convergence rate improves at least linearly with the sketch size, and even faster when the data matrix exhibits certain spectral decays.
Our experiments support the theory and demonstrate that even extremely sparse sketches exhibit the convergence properties predicted by our framework.
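For concreteness, here is one common instance of the framework (the paper analyzes a broader family): sketch-and-project for a consistent linear system under the Euclidean metric, with a fresh Gaussian sketch of size $k$ at every step.

```python
import numpy as np


def sketch_and_project(A, b, k=10, iters=300, seed=0):
    """Solve a consistent system A @ x = b by repeatedly projecting the
    iterate onto the sketched constraint set {x : S.T @ A @ x = S.T @ b}."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    x = np.zeros(n)
    for _ in range(iters):
        S = rng.standard_normal((m, k))       # fresh Gaussian sketch each step
        SA = S.T @ A                          # k x n sketched system
        r = SA @ x - S.T @ b                  # sketched residual
        # Orthogonal projection: x <- x - SA.T (SA SA.T)^{-1} r
        x -= SA.T @ np.linalg.solve(SA @ SA.T, r)
    return x


# Toy usage: a planted solution makes the system consistent.
rng = np.random.default_rng(1)
A = rng.standard_normal((200, 60))
x_star = rng.standard_normal(60)
x_hat = sketch_and_project(A, A @ x_star)
print(np.linalg.norm(x_hat - x_star))         # shrinks as k or iters grow
```

Increasing the sketch size $k$ should tighten the per-iteration contraction, consistent with the at-least-linear improvement described above.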
arXiv Detail & Related papers (2022-08-20T03:11:13Z)
- Multiple Convex Objects Image Segmentation via Proximal Alternating Direction Method of Multipliers [2.294014185517203]
The convex shape prior turns out to be a simple quadratic inequality constraint on the binary indicator function associated with each object.
An image segmentation model incorporating convex shape prior into a probability-based method is proposed.
Numerical experiments on natural and medical images demonstrate that the proposed method is superior to some existing methods.
arXiv Detail & Related papers (2022-03-22T00:05:19Z)
- Distributed Sketching for Randomized Optimization: Exact Characterization, Concentration and Lower Bounds [54.51566432934556]
We consider distributed optimization methods for problems where forming the Hessian is computationally challenging.
We leverage randomized sketches for reducing the problem dimensions as well as preserving privacy and improving straggler resilience in asynchronous distributed systems.
arXiv Detail & Related papers (2022-03-18T05:49:13Z)
- Newton-LESS: Sparsification without Trade-offs for the Sketched Newton Update [88.73437209862891]
In second-order optimization, a potential bottleneck is computing the Hessian matrix of the objective at every iteration.
We show that the Gaussian sketching matrix can be drastically sparsified, significantly reducing the computational cost of sketching.
We prove that Newton-LESS enjoys nearly the same problem-independent local convergence rate as Gaussian embeddings.
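For intuition about sparsification, here is a generic sparse sign sketch with $s$ nonzeros per row, in the spirit of (but simpler than) LESS embeddings; the scaling keeps $E[S^T S] = I$, and applying the sketch touches only an $s/n$ fraction of the entries a dense Gaussian sketch would.

```python
import numpy as np


def sparse_sign_sketch(m, n, s=4, seed=0):
    """A sketching matrix with s random signed entries per row, scaled so
    that E[S.T @ S] = I. Illustrative of sparsified embeddings; the actual
    LESS construction is more refined (e.g., leverage-score sampling)."""
    rng = np.random.default_rng(seed)
    S = np.zeros((m, n))
    scale = np.sqrt(n / s) / np.sqrt(m)
    for i in range(m):
        cols = rng.choice(n, size=s, replace=False)
        S[i, cols] = scale * rng.choice([-1.0, 1.0], size=s)
    return S


# Toy usage: (S @ C).T @ (S @ C) approximates C.T @ C in expectation.
rng = np.random.default_rng(1)
C = rng.standard_normal((4096, 20))
S = sparse_sign_sketch(512, 4096)
print(np.linalg.norm((S @ C).T @ (S @ C) - C.T @ C) / np.linalg.norm(C.T @ C))
```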
arXiv Detail & Related papers (2021-07-15T17:33:05Z)
- Fast Convex Quadratic Optimization Solvers with Adaptive Sketching-based Preconditioners [37.03247707259297]
We consider least-squares problems with quadratic regularization and propose novel sketching-based iterative methods with an adaptive sketch size.
The sketch size can be as small as the effective dimension of the data matrix to guarantee linear convergence.
A major difficulty in choosing the sketch size in terms of the effective dimension lies in the fact that the latter is usually unknown in practice.
arXiv Detail & Related papers (2021-04-29T04:36:41Z)
- Effective Dimension Adaptive Sketching Methods for Faster Regularized Least-Squares Optimization [56.05635751529922]
We propose a new randomized algorithm for solving L2-regularized least-squares problems based on sketching.
We consider two of the most popular random embeddings, namely, Gaussian embeddings and the Subsampled Randomized Hadamard Transform (SRHT).
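For reference, a textbook SRHT implementation (assuming the number of rows is a power of two): flip signs with a random diagonal, mix with a fast Walsh-Hadamard transform, then subsample rows, rescaled so that $E[S^T S] = I$.

```python
import numpy as np


def fwht(X):
    """Orthonormal fast Walsh-Hadamard transform along axis 0
    (X.shape[0] must be a power of two)."""
    n = X.shape[0]
    h = 1
    while h < n:
        for i in range(0, n, 2 * h):
            a = X[i:i + h].copy()
            b = X[i + h:i + 2 * h].copy()
            X[i:i + h], X[i + h:i + 2 * h] = a + b, a - b
        h *= 2
    return X / np.sqrt(n)


def srht_sketch(C, m, seed=0):
    """SRHT sketch of C (n x d, n a power of two): random sign flips,
    Hadamard mixing, then uniform row subsampling, rescaled so that
    E[S.T @ S] = I."""
    rng = np.random.default_rng(seed)
    n = C.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)[:, None]
    X = fwht(signs * C)
    rows = rng.choice(n, size=m, replace=False)
    return np.sqrt(n / m) * X[rows]


# Toy usage: a 4096 x 20 matrix sketched down to 256 rows.
rng = np.random.default_rng(1)
C = rng.standard_normal((4096, 20))
print(srht_sketch(C, 256).shape)
```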
arXiv Detail & Related papers (2020-06-10T15:00:09Z)
- Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
The class of algorithms analyzed encompasses several randomized methods that are among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
arXiv Detail & Related papers (2020-02-21T17:45:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.