Learning the Positions in CountSketch
- URL: http://arxiv.org/abs/2007.09890v3
- Date: Mon, 7 Jun 2021 07:30:46 GMT
- Title: Learning the Positions in CountSketch
- Authors: Simin Liu, Tianrui Liu, Ali Vakilian, Yulin Wan, David P. Woodruff
- Abstract summary: We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work we propose the first learning algorithm that also optimizes the locations of the non-zero entries.
We show this algorithm gives better accuracy for low rank approximation than previous work, and apply it to other problems such as $k$-means clustering for the first time.
- Score: 51.15935547615698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider sketching algorithms which first quickly compress data by
multiplication with a random sketch matrix, and then apply the sketch to
quickly solve an optimization problem, e.g., low rank approximation. In the
learning-based sketching paradigm proposed by Indyk et al. [2019], the sketch
matrix is found by choosing a random sparse matrix, e.g., the CountSketch, and
then updating the values of the non-zero entries by running gradient descent on
a training data set. Despite the growing body of work on this paradigm, a
noticeable omission is that the locations of the non-zero entries of previous
algorithms were fixed, and only their values were learned. In this work we
propose the first learning algorithm that also optimizes the locations of the
non-zero entries. We show this algorithm gives better accuracy for low rank
approximation than previous work, and apply it to other problems such as
$k$-means clustering for the first time. We show that our algorithm is provably
better in the spiked covariance model and for Zipfian matrices. We also show
the importance of the sketch monotonicity property for combining learned
sketches. Our empirical results show the importance of optimizing not only the
values of the non-zero entries but also their positions.
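To make the setup concrete, the sketch below is a minimal NumPy illustration of a classical (non-learned) CountSketch: each column of the sketch matrix has a single non-zero entry at a random position with a random sign. In the learned paradigm described above, the values (and, in this paper, the positions) would be optimized on training data instead of drawn at random; the dimensions here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 1000, 50, 20  # data rows, data columns, sketch size

# CountSketch: each of the n rows of A is hashed to one of m buckets
# with a random sign. The bucket choices h are the "positions" and the
# signs s are the "values" of the non-zero entries; the paper learns
# both, here they are random.
h = rng.integers(0, m, size=n)       # non-zero positions
s = rng.choice([-1.0, 1.0], size=n)  # non-zero values

A = rng.standard_normal((n, d))

# Apply S @ A without materializing the m x n sketch matrix S.
SA = np.zeros((m, d))
np.add.at(SA, h, s[:, None] * A)

# Equivalent dense construction, for checking.
S = np.zeros((m, n))
S[h, np.arange(n)] = s
assert np.allclose(SA, S @ A)
```

Because S has one non-zero per column, S @ A can be applied in time proportional to the number of non-zeros of A, which is what makes CountSketch attractive as the starting point for learned sketches.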
Related papers
- Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z) - Learning Sparsity and Randomness for Data-driven Low Rank Approximation [0.0]
Learning-based low rank approximation algorithms can significantly improve the performance of randomized low rank approximation with sketch matrix.
We introduce two new methods, Learning Sparsity and Learning Randomness, which learn a better sparsity pattern and add randomness to the values of the sketch matrix.
arXiv Detail & Related papers (2022-12-15T23:12:53Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
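For reference, the quantity that paper estimates has a simple exact characterization: the statistical leverage score of row i is the squared norm of the i-th row of any orthonormal basis for the column space. The snippet below computes exact scores via QR; this is the textbook definition, not the paper's randomized estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))

# Exact leverage scores: squared row norms of an orthonormal basis
# Q of range(A), obtained here from a thin QR factorization.
Q, _ = np.linalg.qr(A)
lev = np.sum(Q**2, axis=1)

# Sanity checks: scores lie in [0, 1] and sum to rank(A).
assert np.all((lev >= 0) & (lev <= 1 + 1e-12))
assert np.isclose(lev.sum(), np.linalg.matrix_rank(A))
```

The exact computation costs a full factorization, which is why randomized dimensionality reduction, as in the paper above, is used to approximate the scores for large matrices.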
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Accumulations of Projections--A Unified Framework for Random Sketches in Kernel Ridge Regression [12.258887270632869]
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerate the computation of many kernel methods.
We propose a unified framework of constructing sketching methods in kernel ridge regression.
arXiv Detail & Related papers (2021-03-06T05:02:17Z) - Learning-Augmented Sketches for Hessians [54.97773807211337]
We show how to design learned sketches for the Hessian in the context of second order methods.
We show empirically that learned sketches, compared with their "non-learned" counterparts, improve the approximation accuracy for important problems.
arXiv Detail & Related papers (2021-02-24T14:50:59Z) - Effective and Sparse Count-Sketch via k-means clustering [17.26941311020711]
We propose to reduce the reconstruction error of count-sketch by using k-means clustering algorithm to obtain the low-dimensional sketched matrix.
Our experimental results based on six real-life classification datasets have demonstrated that our proposed method achieves higher accuracy than the original count-sketch.
arXiv Detail & Related papers (2020-11-24T11:44:02Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
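As a baseline for the randomized least-squares methods surveyed above, the snippet below shows plain sketch-and-solve with a Gaussian embedding: solve the small sketched problem instead of the full one. This is a simpler scheme than the iterative first-order methods the paper studies; the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 2000, 20, 200
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian embedding: replace the n x d problem with an m x d one.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched solution's residual is near-optimal (a (1 + eps) factor).
r_sketch = np.linalg.norm(A @ x_sketch - b)
r_exact = np.linalg.norm(A @ x_exact - b)
```

Iterative methods like those in the paper use the sketch to precondition or accelerate repeated solves, driving the error below this one-shot (1 + eps) guarantee.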
arXiv Detail & Related papers (2020-02-21T17:45:32Z) - Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.