Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nystr\"om method
- URL: http://arxiv.org/abs/2002.09073v3
- Date: Fri, 18 Dec 2020 21:19:09 GMT
- Title: Improved guarantees and a multiple-descent curve for Column Subset
Selection and the Nystr\"om method
- Authors: Micha{\l} Derezi\'nski, Rajiv Khanna and Michael W. Mahoney
- Abstract summary: We develop techniques which exploit spectral properties of the data matrix to obtain improved approximation guarantees.
Our approach leads to significantly better bounds for datasets with known rates of singular value decay.
We show that both our improved bounds and the multiple-descent curve can be observed on real datasets simply by varying the RBF parameter.
- Score: 76.73096213472897
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The Column Subset Selection Problem (CSSP) and the Nystr\"om method are among
the leading tools for constructing small low-rank approximations of large
datasets in machine learning and scientific computing. A fundamental question
in this area is: how well can a data subset of size k compete with the best
rank k approximation? We develop techniques which exploit spectral properties
of the data matrix to obtain improved approximation guarantees which go beyond
the standard worst-case analysis. Our approach leads to significantly better
bounds for datasets with known rates of singular value decay, e.g., polynomial
or exponential decay. Our analysis also reveals an intriguing phenomenon: the
approximation factor as a function of k may exhibit multiple peaks and valleys,
which we call a multiple-descent curve. A lower bound we establish shows that
this behavior is not an artifact of our analysis, but rather it is an inherent
property of the CSSP and Nystr\"om tasks. Finally, using the example of a
radial basis function (RBF) kernel, we show that both our improved bounds and
the multiple-descent curve can be observed on real datasets simply by varying
the RBF parameter.
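To make the approximation-factor question concrete, the following minimal Python sketch measures, on an RBF kernel matrix, how the projection error of a size-k column subset compares to the error of the best rank-k approximation. It is an illustrative experiment only, not the paper's method: subsets are drawn uniformly at random (the paper analyzes more careful sampling), the error is measured in the Frobenius norm, and the synthetic data, the gamma value, and the subset sizes are arbitrary placeholder choices.

    import numpy as np

    def rbf_kernel(X, gamma):
        # Pairwise RBF kernel: K[i, j] = exp(-gamma * ||x_i - x_j||^2).
        sq = np.sum(X ** 2, axis=1)
        d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
        return np.exp(-gamma * d2)

    def subset_error(A, cols):
        # Frobenius error of projecting A onto the span of the selected columns.
        Q, _ = np.linalg.qr(A[:, cols])
        return np.linalg.norm(A - Q @ (Q.T @ A), "fro")

    def best_rank_k_error(A, k):
        # Frobenius error of the best rank-k approximation (truncated SVD).
        s = np.linalg.svd(A, compute_uv=False)
        return np.sqrt(np.sum(s[k:] ** 2))

    rng = np.random.default_rng(0)
    X = rng.standard_normal((300, 8))   # placeholder data; a real dataset works the same way
    gamma = 0.5                         # RBF parameter; varying it changes the spectral decay
    K = rbf_kernel(X, gamma)

    for k in (2, 5, 10, 20, 40):
        # Average the subset error over a few uniformly sampled column subsets of size k.
        trials = [subset_error(K, rng.choice(K.shape[1], size=k, replace=False))
                  for _ in range(10)]
        factor = np.mean(trials) / best_rank_k_error(K, k)
        print(f"k={k:3d}  empirical approximation factor ~ {factor:.3f}")

Printing this ratio across a range of k (and across several gamma values) is one simple way to visualize how the approximation factor rises and falls with the subset size, which is the quantity the multiple-descent curve describes.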
Related papers
- Recovering Manifold Structure Using Ollivier-Ricci Curvature [1.9458156037869137]
We introduce ORC-ManL, a new algorithm to prune spurious edges from nearest neighbor graphs using a criterion based on Ollivier-Ricci curvature and estimated metric distortion.
Our motivation comes from manifold learning: we show that when the data generating the nearest-neighbor graph consists of noisy samples from a low-dimensional manifold, edges that shortcut through the ambient space have more negative Ollivier-Ricci curvature than edges that lie along the data manifold.
arXiv Detail & Related papers (2024-10-02T01:00:30Z) - Entrywise Inference for Missing Panel Data: A Simple and Instance-Optimal Approach [27.301741710016223]
We consider inferential questions associated with the missing data version of panel data induced by staggered adoption.
We develop and analyze a data-driven procedure for constructing entrywise confidence intervals with pre-specified coverage.
We prove non-asymptotic and high-probability bounds on its error in estimating each missing entry.
arXiv Detail & Related papers (2024-01-24T18:58:18Z) - On the Size and Approximation Error of Distilled Sets [57.61696480305911]
We take a theoretical view on kernel ridge regression based methods of dataset distillation such as Kernel Inducing Points.
We prove that a small set of instances exists in the original input space such that its solution in the RFF space coincides with the solution of the original data.
A KRR solution can be generated using this distilled set of instances, which gives an approximation to the KRR solution optimized on the full input data.
arXiv Detail & Related papers (2023-05-23T14:37:43Z) - Approximating a RUM from Distributions on k-Slates [88.32814292632675]
We give a polynomial-time algorithm that finds the RUM that best approximates the given distribution on average.
Our theoretical result can also be made practical: we obtain an algorithm that is effective and scales to real-world datasets.
arXiv Detail & Related papers (2023-05-22T17:43:34Z) - Rethinking k-means from manifold learning perspective [122.38667613245151]
We present a new clustering algorithm which directly detects clusters of data without mean estimation.
Specifically, we construct a distance matrix between data points using a Butterworth filter.
To well exploit the complementary information embedded in different views, we leverage the tensor Schatten p-norm regularization.
arXiv Detail & Related papers (2023-05-12T03:01:41Z) - Local Random Feature Approximations of the Gaussian Kernel [14.230653042112834]
We focus on the popular Gaussian kernel and on techniques to linearize kernel-based models by means of random feature approximations.
We show that such approaches yield poor results when modelling high-frequency data, and we propose a novel localization scheme that improves kernel approximations and downstream performance significantly.
arXiv Detail & Related papers (2022-04-12T09:52:36Z) - Spectrahedral Regression [7.2361978133966165]
We present a new approach to the problem of fitting a convex function to a data set consisting of input-output pairs.
We show how well convex functions can be approximated in this way and provide a statistical risk analysis.
We demonstrate our approach with experiments on synthetic data sets as well as real data in applications such as economics and engineering design.
arXiv Detail & Related papers (2021-10-27T21:21:19Z) - Bayesian Sparse Factor Analysis with Kernelized Observations [67.60224656603823]
Multi-view problems can be addressed with latent variable models.
High-dimensionality and non-linear issues are traditionally handled by kernel methods.
We propose merging both approaches into a single model.
arXiv Detail & Related papers (2020-06-01T14:25:38Z) - SLEIPNIR: Deterministic and Provably Accurate Feature Expansion for
Gaussian Process Regression with Derivatives [86.01677297601624]
We propose a novel approach for scaling GP regression with derivatives based on quadrature Fourier features.
We prove deterministic, non-asymptotic and exponentially fast decaying error bounds which apply for both the approximated kernel as well as the approximated posterior.
arXiv Detail & Related papers (2020-03-05T14:33:20Z) - Diversity sampling is an implicit regularization for kernel methods [13.136143245702915]
We show that Nystr"om kernel regression with diverse landmarks increases the accuracy of the regression in sparser regions of the dataset.
A greedy is also proposed to select diverse samples of significant size within large datasets when exact DPP sampling is not practically feasible.
arXiv Detail & Related papers (2020-02-20T08:24:42Z)