Learning the Positions in CountSketch
- URL: http://arxiv.org/abs/2007.09890v3
- Date: Mon, 7 Jun 2021 07:30:46 GMT
- Title: Learning the Positions in CountSketch
- Authors: Simin Liu, Tianrui Liu, Ali Vakilian, Yulin Wan, David P. Woodruff
- Abstract summary: We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work we propose the first learning algorithm that also optimizes the locations of the non-zero entries.
We show this algorithm gives better accuracy for low rank approximation than previous work, and apply it to other problems such as $k$-means clustering for the first time.
- Score: 51.15935547615698
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We consider sketching algorithms which first quickly compress data by
multiplication with a random sketch matrix, and then apply the sketch to
quickly solve an optimization problem, e.g., low rank approximation. In the
learning-based sketching paradigm proposed by Indyk et al. [2019], the sketch
matrix is found by choosing a random sparse matrix, e.g., the CountSketch, and
then updating the values of the non-zero entries by running gradient descent on
a training data set. Despite the growing body of work on this paradigm, a
noticeable omission is that the locations of the non-zero entries of previous
algorithms were fixed, and only their values were learned. In this work we
propose the first learning algorithm that also optimizes the locations of the
non-zero entries. We show this algorithm gives better accuracy for low rank
approximation than previous work, and apply it to other problems such as
$k$-means clustering for the first time. We show that our algorithm is provably
better in the spiked covariance model and for Zipfian matrices. We also show
the importance of the sketch monotonicity property for combining learned
sketches. Our empirical results show the importance of optimizing not only the
values of the non-zero entries but also their positions.
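To make the setup concrete, the sketch below is a minimal NumPy illustration of a classical (non-learned) CountSketch: each column of the sketch matrix has a single non-zero entry at a random position with a random sign. In the learned paradigm described above, the values (and, in this paper, the positions) would be optimized on training data instead of drawn at random; the dimensions here are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 1000, 50, 20  # data rows, data columns, sketch size

# CountSketch: each of the n rows of A is hashed to one of m buckets
# with a random sign. The bucket choices h are the "positions" and the
# signs s are the "values" of the non-zero entries; the paper learns
# both, here they are random.
h = rng.integers(0, m, size=n)       # non-zero positions
s = rng.choice([-1.0, 1.0], size=n)  # non-zero values

A = rng.standard_normal((n, d))

# Apply S @ A without materializing the m x n sketch matrix S.
SA = np.zeros((m, d))
np.add.at(SA, h, s[:, None] * A)

# Equivalent dense construction, for checking.
S = np.zeros((m, n))
S[h, np.arange(n)] = s
assert np.allclose(SA, S @ A)
```

Because S has one non-zero per column, S @ A can be applied in time proportional to the number of non-zeros of A, which is what makes CountSketch attractive as the starting point for learned sketches.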
Related papers
- Learning the Positions in CountSketch [49.57951567374372]
We consider sketching algorithms which first compress data by multiplication with a random sketch matrix, and then apply the sketch to quickly solve an optimization problem.
In this work, we propose the first learning-based algorithms that also optimize the locations of the non-zero entries.
arXiv Detail & Related papers (2023-06-11T07:28:35Z) - Learning Sparsity and Randomness for Data-driven Low Rank Approximation [0.0]
Learning-based low rank approximation algorithms can significantly improve the performance of randomized low rank approximation with sketch matrix.
We introduce two new methods, Learning Sparsity and Learning Randomness, which learn a better sparsity pattern and add randomness to the values of the sketch matrix.
arXiv Detail & Related papers (2022-12-15T23:12:53Z) - Estimating leverage scores via rank revealing methods and randomization [50.591267188664666]
We study algorithms for estimating the statistical leverage scores of rectangular dense or sparse matrices of arbitrary rank.
Our approach is based on combining rank revealing methods with compositions of dense and sparse randomized dimensionality reduction transforms.
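For reference, the quantity that paper estimates has a simple exact characterization: the statistical leverage score of row i is the squared norm of the i-th row of any orthonormal basis for the column space. The snippet below computes exact scores via QR; this is the textbook definition, not the paper's randomized estimator.

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((200, 10))

# Exact leverage scores: squared row norms of an orthonormal basis
# Q of range(A), obtained here from a thin QR factorization.
Q, _ = np.linalg.qr(A)
lev = np.sum(Q**2, axis=1)

# Sanity checks: scores lie in [0, 1] and sum to rank(A).
assert np.all((lev >= 0) & (lev <= 1 + 1e-12))
assert np.isclose(lev.sum(), np.linalg.matrix_rank(A))
```

The exact computation costs a full factorization, which is why randomized dimensionality reduction, as in the paper above, is used to approximate the scores for large matrices.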
arXiv Detail & Related papers (2021-05-23T19:21:55Z) - Accumulations of Projections--A Unified Framework for Random Sketches in Kernel Ridge Regression [12.258887270632869]
Building a sketch of an n-by-n empirical kernel matrix is a common approach to accelerate the computation of many kernel methods.
We propose a unified framework of constructing sketching methods in kernel ridge regression.
arXiv Detail & Related papers (2021-03-06T05:02:17Z) - Learning-Augmented Sketches for Hessians [54.97773807211337]
We show how to design learned sketches for the Hessian in the context of second order methods.
We show empirically that learned sketches, compared with their "non-learned" counterparts, improve the approximation accuracy for important problems.
arXiv Detail & Related papers (2021-02-24T14:50:59Z) - Effective and Sparse Count-Sketch via k-means clustering [17.26941311020711]
We propose to reduce the reconstruction error of count-sketch by using k-means clustering algorithm to obtain the low-dimensional sketched matrix.
Our experimental results based on six real-life classification datasets have demonstrated that our proposed method achieves higher accuracy than the original count-sketch.
arXiv Detail & Related papers (2020-11-24T11:44:02Z) - Optimal Randomized First-Order Methods for Least-Squares Problems [56.05635751529922]
This class of algorithms encompasses several randomized methods among the fastest solvers for least-squares problems.
We focus on two classical embeddings, namely, Gaussian projections and subsampled Hadamard transforms.
Our resulting algorithm yields the best complexity known for solving least-squares problems with no condition number dependence.
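As a baseline for the randomized least-squares methods surveyed above, the snippet below shows plain sketch-and-solve with a Gaussian embedding: solve the small sketched problem instead of the full one. This is a simpler scheme than the iterative first-order methods the paper studies; the dimensions are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d, m = 2000, 20, 200
A = rng.standard_normal((n, d))
b = rng.standard_normal(n)

# Gaussian embedding: replace the n x d problem with an m x d one.
S = rng.standard_normal((m, n)) / np.sqrt(m)
x_sketch, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
x_exact, *_ = np.linalg.lstsq(A, b, rcond=None)

# The sketched solution's residual is near-optimal (a (1 + eps) factor).
r_sketch = np.linalg.norm(A @ x_sketch - b)
r_exact = np.linalg.norm(A @ x_exact - b)
```

Iterative methods like those in the paper use the sketch to precondition or accelerate repeated solves, driving the error below this one-shot (1 + eps) guarantee.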
arXiv Detail & Related papers (2020-02-21T17:45:32Z) - Optimal Iterative Sketching with the Subsampled Randomized Hadamard Transform [64.90148466525754]
We study the performance of iterative sketching for least-squares problems.
We show that the convergence rates for Haar and randomized Hadamard matrices are identical, and asymptotically improve upon random projections.
These techniques may be applied to other algorithms that employ randomized dimension reduction.
arXiv Detail & Related papers (2020-02-03T16:17:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.