Downsampling for Testing and Learning in Product Distributions
- URL: http://arxiv.org/abs/2007.07449v2
- Date: Mon, 15 Nov 2021 20:53:31 GMT
- Title: Downsampling for Testing and Learning in Product Distributions
- Authors: Nathaniel Harms, Yuichi Yoshida
- Abstract summary: We study distribution-free property testing and learning problems where the unknown probability distribution is a product distribution over $\mathbb{R}^d$.
For many important classes of functions, such as intersections of halfspaces, polynomial threshold functions, convex sets, and $k$-alternating functions, the known algorithms either have complexity that depends on the support size of the distribution, or are proven to work only for specific examples of product distributions.
We introduce a general method, which we call downsampling, that resolves these issues.
- Score: 24.48103093661132
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study distribution-free property testing and learning problems where the
unknown probability distribution is a product distribution over $\mathbb{R}^d$.
For many important classes of functions, such as intersections of halfspaces,
polynomial threshold functions, convex sets, and $k$-alternating functions, the
known algorithms either have complexity that depends on the support size of the
distribution, or are proven to work only for specific examples of product
distributions. We introduce a general method, which we call downsampling, that
resolves these issues. Downsampling uses a notion of "rectilinear isoperimetry"
for product distributions, which further strengthens the connection between
isoperimetry, testing, and learning. Using this technique, we attain new
efficient distribution-free algorithms under product distributions on
$\mathbb{R}^d$:
1. A simpler proof for non-adaptive, one-sided monotonicity testing of
functions $[n]^d \to \{0,1\}$, and improved sample complexity for testing
monotonicity over unknown product distributions, from $O(d^7)$ [Black,
Chakrabarty, & Seshadhri, SODA 2020] to $\widetilde O(d^3)$.
2. Polynomial-time agnostic learning algorithms for functions of a constant
number of halfspaces, and constant-degree polynomial threshold functions.
3. An $\exp(O(d \log(dk)))$-time agnostic learning algorithm, and an
$\exp(O(d \log(dk)))$-sample tolerant tester, for functions of $k$ convex sets;
and a $2^{\widetilde O(d)}$ sample-based one-sided tester for convex sets.
4. An $\exp(\widetilde O(k \sqrt d))$-time agnostic learning algorithm for
$k$-alternating functions, and a sample-based tolerant tester with the same
complexity.
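The core of the downsampling idea is to coarsen $\mathbb{R}^d$ into a small grid whose cells each carry roughly equal mass under the unknown product distribution, using per-coordinate empirical quantiles. The sketch below is an illustrative assumption about how such a coarsening step could look, not the paper's exact construction; the function name `downsample` and the quantile-binning choice are ours.

```python
import numpy as np

def downsample(points: np.ndarray, r: int) -> np.ndarray:
    """Map samples from an unknown product distribution on R^d onto the
    coarse grid {0, ..., r-1}^d via per-coordinate empirical quantiles.

    Because the coordinates are independent (product distribution),
    splitting each coordinate into r cells of roughly equal empirical
    mass makes the induced grid distribution close to uniform per axis.
    """
    points = np.asarray(points, dtype=float)
    n, d = points.shape
    grid = np.empty((n, d), dtype=int)
    for j in range(d):
        # Interior cell boundaries: the 1/r, 2/r, ..., (r-1)/r quantiles
        # of coordinate j, estimated from the sample itself.
        edges = np.quantile(points[:, j], np.linspace(0, 1, r + 1)[1:-1])
        # Cell index = number of boundaries at or below the point.
        grid[:, j] = np.searchsorted(edges, points[:, j], side="right")
    return grid
```

After this reduction, a testing or learning algorithm designed for the grid $[r]^d$ can be run on the transformed sample, which is how support-size-dependent complexity is avoided.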
Related papers
- Learning Intersections of Halfspaces with Distribution Shift: Improved Algorithms and SQ Lower Bounds [9.036777309376697]
Klivans, Stavropoulos, and Vasilyan initiated the study of testable learning with distribution shift.
Their model deviates from all prior work in that no assumptions are made on $\mathcal{D}'$.
arXiv Detail & Related papers (2024-04-02T23:34:39Z) - Agnostically Learning Multi-index Models with Queries [54.290489524576756]
We study the power of query access for the task of agnostic learning under the Gaussian distribution.
We show that query access gives significant runtime improvements over random examples for agnostically learning MIMs.
arXiv Detail & Related papers (2023-12-27T15:50:47Z) - Testable Learning with Distribution Shift [9.036777309376697]
We define a new model called testable learning with distribution shift.
We obtain provably efficient algorithms for certifying the performance of a classifier on a test distribution.
We give several positive results for learning concept classes such as halfspaces, intersections of halfspaces, and decision trees.
arXiv Detail & Related papers (2023-11-25T23:57:45Z) - Efficiently Learning One-Hidden-Layer ReLU Networks via Schur Polynomials [50.90125395570797]
We study the problem of PAC learning a linear combination of $k$ ReLU activations under the standard Gaussian distribution on $\mathbb{R}^d$ with respect to the square loss.
Our main result is an efficient algorithm for this learning task with sample and computational complexity $(dk/\epsilon)^{O(k)}$, where $\epsilon>0$ is the target accuracy.
arXiv Detail & Related papers (2023-07-24T14:37:22Z) - Multi-block-Single-probe Variance Reduced Estimator for Coupled Compositional Optimization [49.58290066287418]
We propose a novel method named Multi-block-Single-probe Variance Reduced (MSVR) estimator to alleviate the complexity of coupled compositional problems.
Our results improve upon prior ones in several aspects, including the order of sample complexities and the dependence on strong convexity.
arXiv Detail & Related papers (2022-07-18T12:03:26Z) - Learning a Single Neuron with Adversarial Label Noise via Gradient Descent [50.659479930171585]
We study a function of the form $\mathbf{x} \mapsto \sigma(\mathbf{w} \cdot \mathbf{x})$ for monotone activations.
The goal of the learner is to output a hypothesis vector $\mathbf{w}$ such that $F(\mathbf{w}) = C\,\epsilon$ with high probability.
arXiv Detail & Related papers (2022-06-17T17:55:43Z) - Testing distributional assumptions of learning algorithms [5.204779946147061]
We study the design of tester-learner pairs $(\mathcal{A}, \mathcal{T})$.
We show that if the distribution on examples in the data passes the tester $\mathcal{T}$, then one can safely trust the output of the agnostic learner $\mathcal{A}$ on the data.
arXiv Detail & Related papers (2022-04-14T19:10:53Z) - Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $\ell_1$ distance essentially at most a constant times larger than its distance to the closest $t$-piece degree-$d$ polynomial.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Optimal Testing of Discrete Distributions with High Probability [49.19942805582874]
We study the problem of testing discrete distributions with a focus on the high probability regime.
We provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors.
arXiv Detail & Related papers (2020-09-14T16:09:17Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.