Related papers: A Deterministic Streaming Sketch for Ridge Regression

A Deterministic Streaming Sketch for Ridge Regression

URL: http://arxiv.org/abs/2002.02013v4
Date: Wed, 10 Mar 2021 07:47:36 GMT
Title: A Deterministic Streaming Sketch for Ridge Regression
Authors: Benwei Shi and Jeff M. Phillips
Abstract summary: We provide a deterministic space-efficient algorithm for estimating ridge regression. This is the first $o(d2)$ space deterministic streaming algorithm with guaranteed solution error. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.
Score: 15.256452294422294
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: We provide a deterministic space-efficient algorithm for estimating ridge regression. For $n$ data points with $d$ features and a large enough regularization parameter, we provide a solution within $\varepsilon$ L$_2$ error using only $O(d/\varepsilon)$ space. This is the first $o(d^2)$ space deterministic streaming algorithm with guaranteed solution error and risk bound for this classic problem. The algorithm sketches the covariance matrix by variants of Frequent Directions, which implies it can operate in insertion-only streams and a variety of distributed data settings. In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.

Related papers

Differentially Private Space-Efficient Algorithms for Counting Distinct Elements in the Turnstile Model [61.40326886123332]
We give the first sublinear space differentially private algorithms for the fundamental problem of counting distinct elements in the turnstile streaming model.<n>Our result significantly improves upon the space requirements of the state-of-the-art algorithms for this problem, which is linear.<n>When a bound $W$ on the number of times an item appears in the stream is known, our algorithm provides $tildeO_eta(sqrtW)$ additive error using $tildeO_eta(sqrtW)$ space.
arXiv Detail & Related papers (2025-05-29T17:21:20Z)
Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $AinmathbbRntimes d$, proportional to their $ell_p$ norm, when $A$ is presented in a turnstile data stream. Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $tildea_i approx a_i$, and approximates their sampling probabilities up to $varepsilon$ relative error. For logistic regression, our framework yields the first algorithm that achieves a $
arXiv Detail & Related papers (2024-06-01T07:33:41Z)
Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free dimension-free algorithm. Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z)
Private estimation algorithms for stochastic block models and mixture models [63.07482515700984]
General tools for designing efficient private estimation algorithms. First efficient $(epsilon, delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z)
Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model. The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z)
Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
We give a sketching-based iterative algorithm that computes $1+varepsilon$ approximate solutions for the ridge regression problem. We also show that this algorithm can be used to give faster algorithms for kernel ridge regression.
arXiv Detail & Related papers (2022-04-13T22:18:47Z)
Simultaenous Sieves: A Deterministic Streaming Algorithm for Non-Monotone Submodular Maximization [16.346069386394703]
We present a deterministic, single-pass streaming algorithm for the problem of maximizing a submodular function, not necessarily a monotone, with respect to a cardinality constraint. For a monotone, single-pass streaming algorithm, our algorithm achieves in time an improvement of the best approximation factor from $1/9$ of previous literature to $approx 0.2689$.
arXiv Detail & Related papers (2020-10-27T15:22:49Z)
Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model. We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z)
Efficient Algorithms for Multidimensional Segmented Regression [42.046881924063044]
We study the fundamental problem of fixed design em multidimensional regression. We provide the first sample and computationally efficient algorithm for this problem in any fixed dimension. Our algorithm relies on a simple merging iterative approach, which is novel in the multidimensional setting.
arXiv Detail & Related papers (2020-03-24T19:39:34Z)
Learning Sparse Classifiers: Continuous and Mixed Integer Optimization Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $ell_0$-regularized regression problems. We propose two classes of scalable algorithms: an exact algorithm that can handlepapprox 50,000$ features in a few minutes, and approximate algorithms that can address instances with $papprox6$. In addition, we present new estimation error bounds for $ell$-regularizeds.
arXiv Detail & Related papers (2020-01-17T18:47:02Z)
Optimal $\delta$-Correct Best-Arm Selection for Heavy-Tailed Distributions [2.2940141855172036]
We consider the problem of identifying the one with the maximum mean using a $delta$-correct algorithm. Lower bounds for $delta$-correct algorithms are well known. We propose a $delta$-correct algorithm that matches the lower bound as $delta$ reduces to zero.
arXiv Detail & Related papers (2019-08-24T05:31:49Z)

This list is automatically generated from the titles and abstracts of the papers in this site.