Learning-Augmented Sketches for Hessians
- URL: http://arxiv.org/abs/2102.12317v1
- Date: Wed, 24 Feb 2021 14:50:59 GMT
- Title: Learning-Augmented Sketches for Hessians
- Authors: Yi Li, Honghao Lin, David P. Woodruff
- Abstract summary: We show how to design learned sketches for the Hessian in the context of second order methods.
We show empirically that learned sketches, compared with their "non-learned" counterparts, improve the approximation accuracy for important problems.
- Score: 54.97773807211337
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Sketching is a dimensionality reduction technique where one compresses a
matrix by linear combinations that are typically chosen at random. A line of
work has shown how to sketch the Hessian to speed up each iteration in a second
order method, but such sketches usually depend only on the matrix at hand, and
in a number of cases are even oblivious to the input matrix. One could instead
hope to learn a distribution on sketching matrices that is optimized for the
specific distribution of input matrices. We show how to design learned sketches
for the Hessian in the context of second order methods, where we learn
potentially different sketches for the different iterations of an optimization
procedure. We show empirically that learned sketches, compared with their
"non-learned" counterparts, improve the approximation accuracy for important
problems, including LASSO, SVM, and matrix estimation with nuclear norm
constraints. Several of our schemes can be proven to perform no worse than
their unlearned counterparts. Additionally, we show that a smaller sketching
dimension of the column space of a tall matrix is possible, assuming an oracle
for predicting rows which have a large leverage score.
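To make the setup concrete, here is a minimal numpy sketch of the unlearned baseline the abstract starts from: a CountSketch matrix $S$ compresses a tall matrix $A$ so that $(SA)^\top (SA)$ approximates the Hessian $A^\top A$ of a least-squares objective, giving a cheap Newton-type step. The paper's contribution is to learn the positions and values of $S$ (potentially per iteration); the random positions and signs below are only the baseline, and the training loop is not reproduced here.

```python
import numpy as np

def countsketch(A, m, rng):
    """Apply an m x n CountSketch to the tall matrix A (n x d) without
    materializing S: each row of A is hashed to one of m buckets with a
    random sign, so S @ A costs O(nnz(A))."""
    n, d = A.shape
    buckets = rng.integers(0, m, size=n)      # learned in the paper; random here
    signs = rng.choice([-1.0, 1.0], size=n)   # likewise potentially learned
    SA = np.zeros((m, d))
    np.add.at(SA, buckets, signs[:, None] * A)
    return SA

rng = np.random.default_rng(0)
n, d, m = 10_000, 20, 2_000
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Exact Newton step for least squares: solve (A^T A) p = A^T b.
p_exact = np.linalg.solve(A.T @ A, A.T @ b)

# Sketched step: (SA)^T (SA) stands in for the Hessian A^T A.
SA = countsketch(A, m, rng)
p_sketch = np.linalg.solve(SA.T @ SA, A.T @ b)
print("relative step error:",
      np.linalg.norm(p_sketch - p_exact) / np.linalg.norm(p_exact))
```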
Related papers
- Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z)
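A hypothetical illustration of the idea in the entry above: the only free parameters of a CountSketch are the position and sign of the single non-zero in each column, so "learning the positions" means choosing that assignment from data rather than hashing at random. The heavy-row-isolation rule below is only a stand-in heuristic, not the learned construction from the paper.

```python
import numpy as np

def sketch_matrix(positions, signs, m):
    """Materialize the m x n CountSketch: one non-zero per column,
    at row positions[j] with value signs[j]."""
    n = len(positions)
    S = np.zeros((m, n))
    S[positions, np.arange(n)] = signs
    return S

rng = np.random.default_rng(1)
n, m = 1_000, 50
A = rng.standard_normal((n, 200))
A[:25] *= 10.0                       # a few heavy rows dominate A^T A

signs = rng.choice([-1.0, 1.0], size=n)
random_pos = rng.integers(0, m, size=n)    # classical CountSketch positions

# Stand-in for learned positions: give each of the m//2 heaviest rows a
# private bucket, and hash everything else into the remaining buckets.
order = np.argsort(-np.linalg.norm(A, axis=1))
learned_pos = np.empty(n, dtype=int)
learned_pos[order[: m // 2]] = np.arange(m // 2)
learned_pos[order[m // 2:]] = rng.integers(m // 2, m, size=n - m // 2)

for name, pos in [("random", random_pos), ("heavy rows isolated", learned_pos)]:
    SA = sketch_matrix(pos, signs, m) @ A
    err = np.linalg.norm(SA.T @ SA - A.T @ A) / np.linalg.norm(A.T @ A)
    print(name, "Gram-matrix error:", round(err, 3))
```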
- High-Dimensional Sparse Bayesian Learning without Covariance Matrices [66.60078365202867]
We introduce a new inference scheme that avoids explicit construction of the covariance matrix.
Our approach couples a little-known diagonal estimation result from numerical linear algebra with the conjugate gradient algorithm.
On several simulations, our method scales better than existing approaches in computation time and memory.
arXiv Detail & Related papers (2022-02-25T16:35:26Z)
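The recipe in the entry above can be sketched generically: a stochastic diagonal estimator in the style of Bekas et al. needs only matrix-vector products, and conjugate gradients supplies products with an inverse without ever forming it. The code below applies this pattern to diag(A^{-1}) for an SPD matrix standing in for a covariance; the paper's actual estimator and its Bayesian-learning wrapper differ in detail.

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, cg

def estimate_inverse_diagonal(matvec, n, n_probes=200, rng=None):
    """Estimate diag(A^{-1}) given only matvecs v -> A @ v.

    Combines the stochastic diagonal estimator diag(M) ~ E[z * (M z)]
    for Rademacher z with conjugate gradients, which supplies
    M z = A^{-1} z without ever forming A^{-1}."""
    rng = rng or np.random.default_rng()
    A_op = LinearOperator((n, n), matvec=matvec)
    num = np.zeros(n)
    den = np.zeros(n)
    for _ in range(n_probes):
        z = rng.choice([-1.0, 1.0], size=n)
        x, _ = cg(A_op, z)            # x = A^{-1} z, matrix-free
        num += z * x
        den += z * z
    return num / den

# Demo on a well-conditioned SPD matrix.
rng = np.random.default_rng(2)
n = 300
B = rng.standard_normal((n, n)) / np.sqrt(n)
A = B @ B.T + np.eye(n)
est = estimate_inverse_diagonal(lambda v: A @ v, n, rng=rng)
true = np.diag(np.linalg.inv(A))
print("mean relative error:", np.mean(np.abs(est - true) / true))
```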
- Non-PSD Matrix Sketching with Applications to Regression and Optimization [56.730993511802865]
We present dimensionality reduction methods for non-PSD matrices, as well as their "square-roots".
We show how these techniques can be used for multiple downstream tasks.
arXiv Detail & Related papers (2021-06-16T04:07:48Z)
- Accumulations of Projections--A Unified Framework for Random Sketches in Kernel Ridge Regression [12.258887270632869]
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerate the computation of many kernel methods.
We propose a unified framework of constructing sketching methods in kernel ridge regression.
arXiv Detail & Related papers (2021-03-06T05:02:17Z)
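A minimal instance of the pattern in the entry above, assuming a plain Gaussian sketch: restricting the KRR coefficients to the row space of $SK$ turns the $n \times n$ kernel system into an $m \times m$ one. The paper's framework covers and unifies many more sketch families (sub-sampling, accumulated projections), which are not shown here.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

rng = np.random.default_rng(3)
n, m, lam = 800, 80, 1e-2
X = rng.uniform(-3, 3, size=(n, 1))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)
K = rbf_kernel(X, X)

# Exact KRR: (K + lam*n*I) alpha = y, an n x n solve.
alpha = np.linalg.solve(K + lam * n * np.eye(n), y)

# Sketched KRR: restrict alpha = S^T beta for a Gaussian sketch S (m x n),
# leaving only an m x m system to solve.
S = rng.standard_normal((m, n)) / np.sqrt(m)
KS = K @ S.T                                   # n x m
beta = np.linalg.solve(S @ K @ KS + lam * n * (S @ KS), S @ (K @ y))
print("in-sample prediction gap:",
      np.linalg.norm(KS @ beta - K @ alpha) / np.linalg.norm(K @ alpha))
```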
- Effective and Sparse Count-Sketch via k-means clustering [17.26941311020711]
We propose to reduce the reconstruction error of count-sketch by using the k-means clustering algorithm to obtain the low-dimensional sketched matrix.
Our experimental results based on six real-life classification datasets have demonstrated that our proposed method achieves higher accuracy than the original count-sketch.
arXiv Detail & Related papers (2020-11-24T11:44:02Z)
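A rough illustration of the idea in the entry above, under the assumption that the clustering replaces the random hash: columns assigned to the same k-means cluster share a bucket, so reconstruction averages genuinely similar features instead of arbitrary ones. This unsigned variant and the synthetic data are illustrative only; the paper's construction details may differ.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(4)
n, d, m = 500, 100, 10
# Correlated features: d columns formed from m underlying groups.
base = rng.standard_normal((n, m))
X = np.repeat(base, d // m, axis=1) + 0.1 * rng.standard_normal((n, d))

def bucket_sketch(X, buckets, m):
    """Combine the columns in each bucket (unsigned count-sketch), then
    reconstruct X by projecting onto those bucket directions."""
    S = np.zeros((m, X.shape[1]))
    S[buckets, np.arange(X.shape[1])] = 1.0
    S /= np.sqrt(np.maximum(S.sum(axis=1, keepdims=True), 1.0))
    return (X @ S.T) @ S              # rank-m reconstruction of X

random_buckets = rng.integers(0, m, size=d)
kmeans_buckets = KMeans(n_clusters=m, n_init=10, random_state=0).fit(X.T).labels_

for name, b in [("random", random_buckets), ("k-means", kmeans_buckets)]:
    err = np.linalg.norm(X - bucket_sketch(X, b, m)) / np.linalg.norm(X)
    print(name, "reconstruction error:", round(err, 3))
```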
- Learning the Positions in CountSketch [51.15935547615698]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work we propose the first learning algorithm that also optimizes the locations of the non-zero entries.
We show this algorithm gives better accuracy for low rank approximation than previous work, and apply it to other problems such as $k$-means clustering for the first time.
arXiv Detail & Related papers (2020-07-20T05:06:29Z)
- Localized sketching for matrix multiplication and ridge regression [29.61816941867831]
We consider sketched approximate matrix multiplication and ridge regression in the novel setting of localized sketching.
We show that, under mild conditions, block diagonal sketching matrices require only $O(\text{stable rank}/\epsilon^2)$ and $O(\text{stat. dim.}/\epsilon)$ total sample complexity for matrix multiplication and ridge regression, respectively.
arXiv Detail & Related papers (2020-03-20T04:25:32Z)
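The block diagonal structure in the entry above is easy to picture: if the rows of a tall $A$ live on different machines, each machine applies its own small sketch and only the short sketched blocks are ever combined. The Gaussian blocks and sizes below are illustrative, not the stable-rank-calibrated choices from the paper.

```python
import numpy as np

rng = np.random.default_rng(5)
d, n_blocks, rows_per_block, m_per_block = 30, 8, 500, 60

# A tall matrix stored as row blocks, e.g. split across machines.
blocks = [rng.standard_normal((rows_per_block, d)) for _ in range(n_blocks)]
A = np.vstack(blocks)

# Localized sketching: each block is compressed with its own small Gaussian
# sketch; stacking the results equals applying one block diagonal sketch.
sketched = [
    (rng.standard_normal((m_per_block, rows_per_block)) / np.sqrt(m_per_block)) @ B
    for B in blocks
]
SA = np.vstack(sketched)

exact = A.T @ A
approx = SA.T @ SA                  # approximate matrix product A^T A
print("relative error:", np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```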
- Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
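For the entry above, a minimal iterative Hessian sketch loop for least squares, assuming $n$ is a power of two so that scipy.linalg.hadamard applies: each iteration refreshes the SRHT and preconditions the exact gradient with the sketched Hessian $(SA)^\top (SA)$, converging linearly to the least-squares solution. The optimal step sizes and convergence constants analyzed in the paper are omitted.

```python
import numpy as np
from scipy.linalg import hadamard

def srht(A, m, rng):
    """Subsampled randomized Hadamard transform of A (n x d), n a power
    of two: flip row signs, mix with the Hadamard matrix, sample m rows."""
    n = A.shape[0]
    signs = rng.choice([-1.0, 1.0], size=n)
    HA = hadamard(n) @ (signs[:, None] * A) / np.sqrt(n)
    rows = rng.choice(n, size=m, replace=False)
    return HA[rows] * np.sqrt(n / m)

rng = np.random.default_rng(6)
n, d, m = 1024, 20, 200
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)
x_star = np.linalg.lstsq(A, b, rcond=None)[0]

# Iterative sketching: refresh the sketch every iteration and precondition
# the exact gradient with the sketched Hessian (SA)^T (SA).
x = np.zeros(d)
for t in range(10):
    SA = srht(A, m, rng)
    grad = A.T @ (A @ x - b)
    x -= np.linalg.solve(SA.T @ SA, grad)
    print(t, np.linalg.norm(x - x_star) / np.linalg.norm(x_star))
```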
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences.