Error Estimation for Sketched SVD via the Bootstrap
- URL: http://arxiv.org/abs/2003.04937v1
- Date: Tue, 10 Mar 2020 19:14:08 GMT
- Title: Error Estimation for Sketched SVD via the Bootstrap
- Authors: Miles E. Lopes and N. Benjamin Erichson and Michael W. Mahoney
- Abstract summary: This paper develops a fully data-driven bootstrap method that numerically estimates the actual error of sketched singular vectors/values.
The method is computationally inexpensive, because it operates only on sketched objects, and it requires no passes over the full matrix being factored.
- Score: 60.67199274260768
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In order to compute fast approximations to the singular value decomposition
(SVD) of very large matrices, randomized sketching algorithms have become a
leading approach. However, a key practical difficulty of sketching an SVD is
that the user does not know how far the sketched singular vectors/values are
from the exact ones. Indeed, the user may be forced to rely on analytical
worst-case error bounds, which do not account for the unique structure of a
given problem. As a result, the lack of tools for error estimation often leads
to much more computation than is really necessary. To overcome these
challenges, this paper develops a fully data-driven bootstrap method that
numerically estimates the actual error of sketched singular vectors/values. In
particular, this allows the user to inspect the quality of a rough initial
sketched SVD, and then adaptively predict how much extra work is needed to
reach a given error tolerance. Furthermore, the method is computationally
inexpensive, because it operates only on sketched objects, and it requires no
passes over the full matrix being factored. Lastly, the method is supported by
theoretical guarantees and a very encouraging set of experimental results.
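As a rough illustration of the idea, here is a minimal NumPy sketch of a bootstrap error estimate for the top-k sketched singular values. The function name, the error metric (maximum absolute deviation over the top-k values), and the resampling scheme are our assumptions for illustration; they follow the spirit of the abstract (resampling only sketched objects, no pass over the full matrix) rather than reproducing the paper's exact estimator.

```python
import numpy as np

def bootstrap_sv_error(A_sketch, k, n_boot=30, q=0.95, seed=0):
    """Bootstrap quantile estimate of the sketching error for the
    top-k singular values, computed only from the sketch A_sketch
    (e.g. A_sketch = S @ A for a t x m random sketching matrix S).
    No pass over the full matrix A is required."""
    rng = np.random.default_rng(seed)
    t = A_sketch.shape[0]
    # Point estimate: top-k singular values of the sketch itself.
    sv = np.linalg.svd(A_sketch, compute_uv=False)[:k]
    errs = np.empty(n_boot)
    for b in range(n_boot):
        # Resample the rows of the sketch with replacement and redo the SVD.
        rows = rng.integers(0, t, size=t)
        sv_b = np.linalg.svd(A_sketch[rows], compute_uv=False)[:k]
        errs[b] = np.max(np.abs(sv_b - sv))
    # A high quantile of the bootstrap fluctuations serves as the
    # data-driven error estimate described in the abstract.
    return np.quantile(errs, q)
```

Because every replicate operates only on the t x n sketch, the whole estimate is cheap relative to re-sketching, which is what makes the adaptive "inspect, then predict the extra work needed" loop in the abstract affordable.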
Related papers
- Robust Capped lp-Norm Support Vector Ordinal Regression [85.84718111830752]
Ordinal regression is a specialized supervised problem where the labels show an inherent order.
Support Vector Ordinal Regression, as an outstanding ordinal regression model, is widely used in many ordinal regression tasks.
We introduce a new model, Capped $\ell_p$-Norm Support Vector Ordinal Regression (CSVOR), that is robust to outliers.
arXiv Detail & Related papers (2024-04-25T13:56:05Z)
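For intuition about why capping helps, here is a minimal, hypothetical sketch of a capped $\ell_p$ penalty (names and defaults are ours, not CSVOR's exact formulation): each residual contributes at most a fixed cap, so a single gross outlier cannot dominate the objective.

```python
import numpy as np

def capped_lp_penalty(residuals, p=1.0, cap=1.0):
    """Capped l_p penalty: grows like |r|**p for small residuals but is
    truncated at `cap`, so outliers have bounded influence."""
    return np.minimum(np.abs(residuals) ** p, cap)
```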
- Robust leave-one-out cross-validation for high-dimensional Bayesian models [0.0]
Leave-one-out cross-validation (LOO-CV) is a popular method for estimating out-of-sample predictive accuracy.
Here we propose and analyze a novel mixture estimator to compute LOO-CV criteria.
Our method retains the simplicity and computational convenience of classical approaches, while guaranteeing finite variance of the resulting estimators.
arXiv Detail & Related papers (2022-09-19T17:14:52Z)
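For context, the classical importance-sampling LOO-CV estimator that the proposed mixture estimator is designed to stabilize fits in a few lines; this sketch shows the standard baseline, not the paper's estimator.

```python
import numpy as np
from scipy.special import logsumexp

def is_loo_elpd(loglik):
    """Classical importance-sampling LOO-CV from S posterior draws.

    loglik : (S, n) array with loglik[s, i] = log p(y_i | theta_s).
    Uses p(y_i | y_-i) ~= S / sum_s 1 / p(y_i | theta_s); these
    harmonic-mean weights can have infinite variance, which is the
    failure mode the paper's mixture estimator is built to avoid.
    """
    S = loglik.shape[0]
    return np.log(S) - logsumexp(-loglik, axis=0)
```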
- Minimax rate of consistency for linear models with missing values [0.0]
Missing values arise in most real-world data sets due to the aggregation of multiple sources and intrinsically missing information (sensor failure, unanswered questions in surveys...).
In this paper, we focus on the extensively studied linear models, but in the presence of missing values, which turns out to be quite a challenging task.
This eventually requires solving a number of learning tasks that is exponential in the number of input features, which makes predictions impossible for current real-world datasets.
arXiv Detail & Related papers (2022-02-03T08:45:34Z)
- Generalization of Neural Combinatorial Solvers Through the Lens of Adversarial Robustness [68.97830259849086]
Most datasets only capture a simpler subproblem and likely suffer from spurious features.
We study adversarial robustness - a local generalization property - to reveal hard, model-specific instances and spurious features.
Unlike in other applications, where perturbation models are designed around subjective notions of imperceptibility, our perturbation models are efficient and sound.
Surprisingly, with such perturbations, a sufficiently expressive neural solver does not suffer from the limitations of the accuracy-robustness trade-off common in supervised learning.
arXiv Detail & Related papers (2021-10-21T07:28:11Z)
- Imputation-Free Learning from Incomplete Observations [73.15386629370111]
We introduce the Importance-Guided Stochastic Gradient Descent (IGSGD) method to train models to make inferences from inputs containing missing values, without imputation.
We employ reinforcement learning (RL) to adjust the gradients used to train the models via back-propagation.
Our imputation-free predictions outperform the traditional two-step imputation-based predictions using state-of-the-art imputation methods.
arXiv Detail & Related papers (2021-07-05T12:44:39Z)
- Accurate and fast matrix factorization for low-rank learning [4.435094091999926]
We tackle two important challenges related to the accurate partial singular value decomposition (SVD) and numerical rank estimation of a huge matrix.
We use the concepts of Krylov subspaces such as the Golub-Kahan bidiagonalization process as well as Ritz vectors to achieve these goals.
arXiv Detail & Related papers (2021-04-21T22:35:02Z)
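As a point of reference, a Krylov-subspace partial SVD of the kind this paper builds on is available off the shelf; the snippet below uses SciPy's svds (ARPACK-based by default), which recovers only the k leading singular triplets without forming a full SVD. This is a generic illustration, not the paper's algorithm.

```python
import numpy as np
from scipy.sparse.linalg import svds

rng = np.random.default_rng(0)
# A tall matrix with rapidly decaying spectrum, as in low-rank learning.
A = rng.standard_normal((5000, 400)) @ rng.standard_normal((400, 200))

k = 10
U, s, Vt = svds(A, k=k)        # Krylov-based partial SVD
order = np.argsort(s)[::-1]    # svds returns singular values in ascending order
U, s, Vt = U[:, order], s[order], Vt[order]
```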
- Robust Differentiable SVD [117.35644933471401]
Eigendecomposition of symmetric matrices is at the heart of many computer vision algorithms.
Instability arises in the presence of eigenvalues that are close to each other.
We show that the Taylor expansion of the SVD gradient is theoretically equivalent to the gradient obtained using power iteration (PI), without relying on an iterative process.
arXiv Detail & Related papers (2021-04-08T15:04:15Z)
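To see where that instability comes from, recall that the analytic gradient through an eigendecomposition involves coefficients $K_{ij} = 1/(\lambda_i - \lambda_j)$. A minimal sketch (our naming) makes the blow-up for close eigenvalues explicit, which is exactly what a Taylor-expanded gradient is designed to avoid.

```python
import numpy as np

def eig_grad_coeffs(eigvals):
    """K_ij = 1 / (lam_i - lam_j), the coefficients appearing in the
    analytic gradient of an eigendecomposition.  When two eigenvalues
    nearly coincide, the corresponding entries explode."""
    diff = eigvals[:, None] - eigvals[None, :]
    np.fill_diagonal(diff, np.inf)  # the diagonal does not contribute
    return 1.0 / diff

print(eig_grad_coeffs(np.array([1.0, 1.0 + 1e-9, 2.0])))  # ~1e9 entries
```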
- Sparse PCA via $l_{2,p}$-Norm Regularization for Unsupervised Feature Selection [138.97647716793333]
We propose a simple and efficient unsupervised feature selection method by combining reconstruction error with $l_{2,p}$-norm regularization.
We present an efficient optimization algorithm to solve the proposed unsupervised model, and analyse the convergence and computational complexity of the algorithm theoretically.
arXiv Detail & Related papers (2020-12-29T04:08:38Z)
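The row-sparsity mechanism behind the $l_{2,p}$ regularizer is easy to state in code. The sketch below (our naming, with a small eps for numerical safety) computes $\sum_i \|w_i\|_2^p$ over the rows of the projection matrix; for p < 1 this strongly favors zeroing out entire rows, i.e. discarding whole features.

```python
import numpy as np

def l2p_regularizer(W, p=0.5, eps=1e-12):
    """sum_i ||w_i||_2 ** p over the rows w_i of W.  Zeroing an entire
    row removes the corresponding feature from the reconstruction,
    which is what makes this an unsupervised feature selector."""
    row_norms = np.sqrt((W ** 2).sum(axis=1) + eps)
    return (row_norms ** p).sum()
```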
- Matrix Completion with Quantified Uncertainty through Low Rank Gaussian Copula [30.84155327760468]
This paper proposes a framework for missing value imputation with quantified uncertainty.
The time required to fit the model scales linearly with the number of rows and the number of columns in the dataset.
Empirical results show the method yields state-of-the-art imputation accuracy across a wide range of data types.
arXiv Detail & Related papers (2020-06-18T19:51:42Z)
- Lower Bounds and a Near-Optimal Shrinkage Estimator for Least Squares using Random Projections [37.27708297562079]
We derive an improved estimator for the least squares solution using the Gaussian sketch.
Empirically, this estimator achieves smaller error on simulated and real datasets.
arXiv Detail & Related papers (2020-06-15T06:42:18Z)
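A minimal sketch-and-solve baseline with a scalar shrinkage knob, assuming an i.i.d. Gaussian sketch (all names are ours): the paper's contribution is a near-optimal, data-dependent choice of the shrinkage coefficient, which this sketch leaves as a free parameter.

```python
import numpy as np

def sketched_lstsq(A, b, m, shrink=1.0, seed=0):
    """Gaussian sketch-and-solve least squares: solve
    min_x ||S A x - S b||_2 for an m x n Gaussian sketch S, then scale
    the solution by `shrink`.  shrink=1.0 is the plain sketched estimator."""
    rng = np.random.default_rng(seed)
    S = rng.standard_normal((m, A.shape[0])) / np.sqrt(m)
    x_sk, *_ = np.linalg.lstsq(S @ A, S @ b, rcond=None)
    return shrink * x_sk
```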