A Deterministic Streaming Sketch for Ridge Regression
- URL: http://arxiv.org/abs/2002.02013v4
- Date: Wed, 10 Mar 2021 07:47:36 GMT
- Title: A Deterministic Streaming Sketch for Ridge Regression
- Authors: Benwei Shi and Jeff M. Phillips
- Abstract summary: We provide a deterministic space-efficient algorithm for estimating ridge regression.
This is the first $o(d2)$ space deterministic streaming algorithm with guaranteed solution error.
In comparisons to randomized sketching algorithms on synthetic and real-world datasets, our algorithm has less empirical error using less space and similar time.
- Score: 15.256452294422294
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We provide a deterministic space-efficient algorithm for estimating ridge
regression. For $n$ data points with $d$ features and a large enough
regularization parameter, we provide a solution within $\varepsilon$ L$_2$
error using only $O(d/\varepsilon)$ space. This is the first $o(d^2)$ space
deterministic streaming algorithm with guaranteed solution error and risk bound
for this classic problem. The algorithm sketches the covariance matrix by
variants of Frequent Directions, which implies it can operate in insertion-only
streams and a variety of distributed data settings. In comparisons to
randomized sketching algorithms on synthetic and real-world datasets, our
algorithm has less empirical error using less space and similar time.
Related papers
- Turnstile $\ell_p$ leverage score sampling with applications [56.403488578703865]
We develop a novel algorithm for sampling rows $a_i$ of a matrix $AinmathbbRntimes d$, proportional to their $ell_p$ norm, when $A$ is presented in a turnstile data stream.
Our algorithm not only returns the set of sampled row indexes, it also returns slightly perturbed rows $tildea_i approx a_i$, and approximates their sampling probabilities up to $varepsilon$ relative error.
For logistic regression, our framework yields the first algorithm that achieves a $
arXiv Detail & Related papers (2024-06-01T07:33:41Z) - Deterministic Nonsmooth Nonconvex Optimization [94.01526844386977]
We show that randomization is necessary to obtain a dimension-free dimension-free algorithm.
Our algorithm yields the first deterministic dimension-free algorithm for optimizing ReLU networks.
arXiv Detail & Related papers (2023-02-16T13:57:19Z) - Private estimation algorithms for stochastic block models and mixture
models [63.07482515700984]
General tools for designing efficient private estimation algorithms.
First efficient $(epsilon, delta)$-differentially private algorithm for both weak recovery and exact recovery.
arXiv Detail & Related papers (2023-01-11T09:12:28Z) - Best Policy Identification in Linear MDPs [70.57916977441262]
We investigate the problem of best identification in discounted linear Markov+Delta Decision in the fixed confidence setting under a generative model.
The lower bound as the solution of an intricate non- optimization program can be used as the starting point to devise such algorithms.
arXiv Detail & Related papers (2022-08-11T04:12:50Z) - Sketching Algorithms and Lower Bounds for Ridge Regression [65.0720777731368]
We give a sketching-based iterative algorithm that computes $1+varepsilon$ approximate solutions for the ridge regression problem.
We also show that this algorithm can be used to give faster algorithms for kernel ridge regression.
arXiv Detail & Related papers (2022-04-13T22:18:47Z) - Simultaenous Sieves: A Deterministic Streaming Algorithm for
Non-Monotone Submodular Maximization [16.346069386394703]
We present a deterministic, single-pass streaming algorithm for the problem of maximizing a submodular function, not necessarily a monotone, with respect to a cardinality constraint.
For a monotone, single-pass streaming algorithm, our algorithm achieves in time an improvement of the best approximation factor from $1/9$ of previous literature to $approx 0.2689$.
arXiv Detail & Related papers (2020-10-27T15:22:49Z) - Streaming Complexity of SVMs [110.63976030971106]
We study the space complexity of solving the bias-regularized SVM problem in the streaming model.
We show that for both problems, for dimensions of $frac1lambdaepsilon$, one can obtain streaming algorithms with spacely smaller than $frac1lambdaepsilon$.
arXiv Detail & Related papers (2020-07-07T17:10:00Z) - Efficient Algorithms for Multidimensional Segmented Regression [42.046881924063044]
We study the fundamental problem of fixed design em multidimensional regression.
We provide the first sample and computationally efficient algorithm for this problem in any fixed dimension.
Our algorithm relies on a simple merging iterative approach, which is novel in the multidimensional setting.
arXiv Detail & Related papers (2020-03-24T19:39:34Z) - Learning Sparse Classifiers: Continuous and Mixed Integer Optimization
Perspectives [10.291482850329892]
Mixed integer programming (MIP) can be used to solve (to optimality) $ell_0$-regularized regression problems.
We propose two classes of scalable algorithms: an exact algorithm that can handlepapprox 50,000$ features in a few minutes, and approximate algorithms that can address instances with $papprox6$.
In addition, we present new estimation error bounds for $ell$-regularizeds.
arXiv Detail & Related papers (2020-01-17T18:47:02Z) - Optimal $\delta$-Correct Best-Arm Selection for Heavy-Tailed
Distributions [2.2940141855172036]
We consider the problem of identifying the one with the maximum mean using a $delta$-correct algorithm.
Lower bounds for $delta$-correct algorithms are well known.
We propose a $delta$-correct algorithm that matches the lower bound as $delta$ reduces to zero.
arXiv Detail & Related papers (2019-08-24T05:31:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.