SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm
- URL: http://arxiv.org/abs/2002.09589v2
- Date: Thu, 11 Feb 2021 19:10:43 GMT
- Title: SURF: A Simple, Universal, Robust, Fast Distribution Learning Algorithm
- Authors: Yi Hao, Ayush Jain, Alon Orlitsky, Vaishakh Ravindrakumar
- Abstract summary: SURF is an algorithm for approximating distributions by piecewise polynomials.
It outperforms state-of-the-art algorithms in experiments.
- Score: 64.13217062232874
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Sample- and computationally-efficient distribution estimation is a
fundamental tenet in statistics and machine learning. We present SURF, an
algorithm for approximating distributions by piecewise polynomials. SURF is:
simple, replacing prior complex optimization techniques by straightforward
empirical-probability interpolation of each potential polynomial piece and
plain divide-and-conquer to merge the pieces; universal, as well-known
polynomial-approximation results imply that it accurately approximates a large
class of common distributions; robust to distribution mis-specification as for
any degree $d \le 8$, it estimates any distribution to an $\ell_1$ distance $<
3$ times that of the nearest degree-$d$ piecewise polynomial, improving known
factor upper bounds of 3 for single polynomials and 15 for polynomials with
arbitrarily many pieces; fast, using optimal sample complexity, running in near
sample-linear time, and if given sorted samples it may be parallelized to run
in sub-linear time. In experiments, SURF outperforms state-of-the-art
algorithms.
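As a reading aid, here is a minimal Python sketch of the recipe the abstract describes: interpolate the empirical CDF with a low-degree polynomial on each candidate interval, and use divide-and-conquer to decide whether a piece is kept or split. This is not the authors' implementation; the split-based recursion (the paper merges bottom-up), the crude error proxy, and all parameters (`tol`, `depth`, node placement) are illustrative assumptions.

```python
import numpy as np

def empirical_cdf(samples, xs):
    """Fraction of the (sorted) samples that are <= each query point."""
    return np.searchsorted(samples, xs, side="right") / len(samples)

def fit_piece(samples, lo, hi, degree):
    """Interpolate the empirical CDF at Chebyshev nodes on [lo, hi];
    the polynomial's derivative serves as the density estimate."""
    k = degree + 2                                    # nodes for exact interpolation
    nodes = np.cos(np.pi * (np.arange(k) + 0.5) / k)  # Chebyshev points in [-1, 1]
    xs = lo + (hi - lo) * (nodes + 1) / 2
    cdf_poly = np.polynomial.Polynomial.fit(xs, empirical_cdf(samples, xs), degree + 1)
    return cdf_poly.deriv()

def piece_error(samples, lo, hi, density):
    """Crude L1-style discrepancy between the fitted piece and the data."""
    xs = np.linspace(lo, hi, 64)
    emp = np.gradient(empirical_cdf(samples, xs), xs)  # rough empirical density
    return np.trapz(np.abs(emp - np.clip(density(xs), 0, None)), xs)

def surf_sketch(samples, lo, hi, degree=2, tol=0.05, depth=6):
    """Top-down divide-and-conquer: keep one piece if it fits, else split."""
    density = fit_piece(samples, lo, hi, degree)
    if depth == 0 or piece_error(samples, lo, hi, density) <= tol:
        return [(lo, hi, density)]
    mid = (lo + hi) / 2
    return (surf_sketch(samples, lo, mid, degree, tol, depth - 1)
            + surf_sketch(samples, mid, hi, degree, tol, depth - 1))

samples = np.sort(np.random.default_rng(0).beta(2, 5, 10_000))  # sorted, as SURF assumes
pieces = surf_sketch(samples, 0.0, 1.0)
print(len(pieces), "pieces")
```

The paper's actual merge rule and its statistical guarantees are far more careful; this only conveys the overall interpolate-then-merge structure.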
Related papers
- Learning Submodular Sequencing from Samples [11.528995186765751]
This paper addresses the problem of selecting and ranking items in a sequence to optimize some composite submodular function.
We present an algorithm that achieves an approximation ratio dependent on the curvature of the individual submodular functions.
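For context, the curvature $\kappa_f$ of a normalized monotone submodular function $f$ over a ground set $V$ (with $f(\{e\}) > 0$) is standardly defined as below; that the paper uses exactly this notion is an assumption here:

$$\kappa_f \;=\; 1 - \min_{e \in V} \frac{f(V) - f(V \setminus \{e\})}{f(\{e\})} \;\in\; [0, 1],$$

with $\kappa_f = 0$ for modular (linear) functions; approximation guarantees typically degrade as $\kappa_f \to 1$.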
arXiv Detail & Related papers (2024-09-09T01:33:13Z)
- Polynomial-Time Solutions for ReLU Network Training: A Complexity Classification via Max-Cut and Zonotopes [70.52097560486683]
We prove that the hardness of approximation of ReLU networks not only mirrors the complexity of the Max-Cut problem but also, in certain special cases, exactly corresponds to it.
In particular, when $\epsilon \leq \sqrt{84/83} - 1 \approx 0.006$, we show that it is NP-hard to find an approximate global optimum of the ReLU network objective with relative error $\epsilon$ with respect to the objective value.
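A quick sanity check of the quoted threshold:

```python
import math
print(math.sqrt(84 / 83) - 1)  # 0.006006..., matching the quoted 0.006
```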
arXiv Detail & Related papers (2023-11-18T04:41:07Z)
- Simple Binary Hypothesis Testing under Local Differential Privacy and Communication Constraints [8.261182037130407]
We study simple binary hypothesis testing under both local differential privacy (LDP) and communication constraints.
We qualify our results as either minimax optimal or instance optimal.
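For readers unfamiliar with LDP, the canonical $\epsilon$-LDP channel for a single bit is randomized response, sketched below; the paper's mechanisms and their optimality analysis go well beyond this textbook example.

```python
import math
import random

def randomized_response(bit: int, eps: float) -> int:
    """epsilon-LDP release of one bit: report the truth w.p. e^eps / (1 + e^eps)."""
    p_truth = math.exp(eps) / (1 + math.exp(eps))
    return bit if random.random() < p_truth else 1 - bit
```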
arXiv Detail & Related papers (2023-01-09T18:36:49Z)
- On-Demand Sampling: Learning Optimally from Multiple Distributions [63.20009081099896]
Social and real-world considerations have given rise to multi-distribution learning paradigms.
We establish the optimal sample complexity of these learning paradigms and give algorithms that meet this sample complexity.
Our algorithm design and analysis are enabled by our extensions of online learning techniques for solving zero-sum games.
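The classical online-learning route to zero-sum games, which the summary says the paper extends, is multiplicative weights (Hedge). A self-contained illustrative sketch, with the game matrix, round count, and step size all chosen arbitrarily:

```python
import numpy as np

def mw_zero_sum(A, rounds=2000, eta=0.05):
    """Hedge for the row (minimizing) player against column best responses
    in the matrix game min_x max_y x^T A y."""
    n, _ = A.shape
    w = np.ones(n)
    avg_x = np.zeros(n)
    for _ in range(rounds):
        x = w / w.sum()
        y = int(np.argmax(x @ A))        # column player's best response
        w = w * np.exp(-eta * A[:, y])   # exponential-weights update on losses
        avg_x += x
    avg_x /= rounds
    return avg_x, float(np.max(avg_x @ A))  # averaged strategy, worst-case loss

A = np.array([[0.0, 1.0],
              [1.0, 0.0]])               # matching-pennies-style payoffs
x_bar, value = mw_zero_sum(A)
print(x_bar, value)                      # ~[0.5, 0.5], value near 0.5
```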
arXiv Detail & Related papers (2022-10-22T19:07:26Z)
- Optimal Randomized Approximations for Matrix based Renyi's Entropy [16.651155375441796]
We develop random approximations for integer order $\alpha$ cases and series approximations for non-integer $\alpha$ cases.
Large-scale simulations and real-world applications validate the effectiveness of the developed approximations.
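For concreteness, the matrix-based Rényi entropy of a density matrix $A$ (PSD with unit trace) is $S_\alpha(A) = \frac{1}{1-\alpha}\log_2 \operatorname{tr}(A^\alpha)$, and for integer $\alpha$ the trace term admits Hutchinson-style randomized estimation. The sketch below illustrates that idea; it is not the paper's estimator, and the probe count is arbitrary.

```python
import numpy as np

def renyi_entropy_exact(A, alpha):
    """Matrix-based Renyi entropy via the eigenvalues of a density matrix A."""
    lam = np.clip(np.linalg.eigvalsh(A), 1e-12, None)
    return float(np.log2(np.sum(lam ** alpha)) / (1 - alpha))

def trace_power_hutchinson(A, alpha, num_probes=200, rng=None):
    """Randomized estimate of tr(A^alpha), integer alpha >= 1, using
    E[g^T A^alpha g] = tr(A^alpha) for standard Gaussian probes g."""
    if rng is None:
        rng = np.random.default_rng(0)
    total = 0.0
    for _ in range(num_probes):
        g = rng.standard_normal(A.shape[0])
        v = g
        for _ in range(alpha):           # v = A^alpha g via repeated products
            v = A @ v
        total += g @ v
    return total / num_probes

rng = np.random.default_rng(1)
M = rng.standard_normal((50, 50))
A = M @ M.T
A /= np.trace(A)                         # a random density matrix
alpha = 3
approx = np.log2(trace_power_hutchinson(A, alpha)) / (1 - alpha)
print(approx, renyi_entropy_exact(A, alpha))  # the two should roughly agree
```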
arXiv Detail & Related papers (2022-05-16T02:24:52Z)
- TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than that of its closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
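Schematically, and in notation of my own rather than the paper's, guarantees of this type take the form

$$\lVert \hat f - f \rVert_1 \;\le\; c \cdot \min_{g \in \mathcal{P}_{t,d}} \lVert g - f \rVert_1 \;+\; \tilde{\mathcal{O}}\!\Big(\sqrt{\tfrac{t(d+1)}{n}}\Big),$$

where $\mathcal{P}_{t,d}$ is the class of $t$-piece degree-$d$ polynomials, $n$ the sample count, and $c$ the constant factor referred to above.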
arXiv Detail & Related papers (2022-02-15T03:49:28Z)
- Forster Decomposition and Learning Halfspaces with Noise [60.691817861402676]
A Forster transform is an operation that turns a distribution into one with good anti-concentration properties.
We show that any distribution can be decomposed efficiently as a disjoint mixture of few distributions for which a Forster transform exists and can be computed efficiently.
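Concretely, under the standard definition (assumed here to match the paper's usage), a Forster transform of points $x_1, \dots, x_n \in \mathbb{R}^d$ is an invertible matrix $A$ placing the rescaled points in radial isotropic position:

$$y_i = \frac{A x_i}{\lVert A x_i \rVert}, \qquad \frac{1}{n} \sum_{i=1}^{n} y_i y_i^{\top} = \frac{1}{d} I_d.$$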
arXiv Detail & Related papers (2021-07-12T17:00:59Z)
- Linear-Sample Learning of Low-Rank Distributions [56.59844655107251]
We show that learning $k \times k$, rank-$r$ matrices to normalized $L_1$ distance requires $\Omega(kr/\epsilon^2)$ samples.
We propose an algorithm that uses $\mathcal{O}((kr/\epsilon^2) \log^2(r/\epsilon))$ samples, a number linear in the high dimension and nearly linear in the matrices' typically low rank.
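Illustrative numbers (mine, not the paper's) for what "linear in the high dimension" buys:

```python
k, r, eps = 10_000, 5, 0.1
print(k * r / eps**2)   # 5.0e6: rank-aware rate, up to log factors
print(k**2 / eps**2)    # 1.0e10: generic rate for a k x k distribution
```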
arXiv Detail & Related papers (2020-09-30T19:10:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.