The Price of Tolerance in Distribution Testing
- URL: http://arxiv.org/abs/2106.13414v1
- Date: Fri, 25 Jun 2021 03:59:42 GMT
- Title: The Price of Tolerance in Distribution Testing
- Authors: Clément L. Canonne, Ayush Jain, Gautam Kamath, Jerry Li
- Abstract summary: We show the sample complexity to be $\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n} \cdot \max\left\{\frac{\varepsilon_1}{\varepsilon_2^2},\left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{2}\right\}\right)$, providing a smooth tradeoff between the two previously known cases.
We also provide a similar characterization for the problem of tolerant equivalence testing, where both $p$ and $q$ are unknown.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We revisit the problem of tolerant distribution testing. That is, given
samples from an unknown distribution $p$ over $\{1, \dots, n\}$, is it
$\varepsilon_1$-close to or $\varepsilon_2$-far from a reference distribution
$q$ (in total variation distance)? Despite significant interest over the past
decade, this problem is well understood only in the extreme cases. In the
noiseless setting (i.e., $\varepsilon_1 = 0$) the sample complexity is
$\Theta(\sqrt{n})$, strongly sublinear in the domain size. At the other end of
the spectrum, when $\varepsilon_1 = \varepsilon_2/2$, the sample complexity
jumps to the barely sublinear $\Theta(n/\log n)$. However, very little is known
about the intermediate regime. We fully characterize the price of tolerance in
distribution testing as a function of $n$, $\varepsilon_1$, $\varepsilon_2$, up
to a single $\log n$ factor. Specifically, we show the sample complexity to be
\[\tilde \Theta\left(\frac{\sqrt{n}}{\varepsilon_2^{2}} + \frac{n}{\log n}
\cdot \max
\left\{\frac{\varepsilon_1}{\varepsilon_2^2},\left(\frac{\varepsilon_1}{\varepsilon_2^2}\right)^{\!\!2}\right\}\right),\]
providing a smooth tradeoff between the two previously known cases. We also
provide a similar characterization for the problem of tolerant equivalence
testing, where both $p$ and $q$ are unknown. Surprisingly, in both cases, the
main quantity dictating the sample complexity is the ratio
$\varepsilon_1/\varepsilon_2^2$, and not the more intuitive
$\varepsilon_1/\varepsilon_2$. Of particular technical interest is our lower
bound framework, which involves novel approximation-theoretic tools required to
handle the asymmetry between $\varepsilon_1$ and $\varepsilon_2$, a challenge
absent from previous works.
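To make the tradeoff concrete, here is a minimal Python sketch (illustrative only, not from the paper; the function name and parameter values are hypothetical) that evaluates the expression inside the $\tilde\Theta(\cdot)$ with constants and polylog factors suppressed. At $\varepsilon_1 = 0$ it reduces to the noiseless $\sqrt{n}/\varepsilon_2^2$ rate, while at $\varepsilon_1 = \varepsilon_2/2$ the second term dominates and the bound becomes barely sublinear in $n$, driven by the ratio $\varepsilon_1/\varepsilon_2^2$.

```python
import math

def tolerant_testing_bound(n: int, eps1: float, eps2: float) -> float:
    """Evaluate sqrt(n)/eps2^2 + (n/log n) * max{eps1/eps2^2, (eps1/eps2^2)^2},
    the expression inside the tilde-Theta from the abstract, with constants
    and polylog factors suppressed. Illustrative only, not the paper's algorithm."""
    ratio = eps1 / eps2 ** 2          # the key quantity eps1 / eps2^2
    return math.sqrt(n) / eps2 ** 2 + (n / math.log(n)) * max(ratio, ratio ** 2)

# Hypothetical parameters: domain size n and distance thresholds eps1 <= eps2/2.
n, eps2 = 10 ** 6, 0.1
for eps1 in (0.0, 0.001, 0.01, eps2 / 2):
    print(f"eps1 = {eps1:<6g} ->  ~{tolerant_testing_bound(n, eps1, eps2):.3e} samples")
```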
Related papers
- Almost Minimax Optimal Best Arm Identification in Piecewise Stationary Linear Bandits [55.957560311008926]
We propose a piecewise stationary linear bandit (PSLB) model where the quality of an arm is measured by its return averaged over all contexts.
PS$\varepsilon$BAI$^{+}$ is guaranteed to identify an $\varepsilon$-optimal arm with probability $\ge 1-\delta$ and with a minimal number of samples.
arXiv Detail & Related papers (2024-10-10T06:15:42Z) - Coresets for Multiple $\ell_p$ Regression [47.790126028106734]
We construct coresets of size $\tilde O(\varepsilon^{-2}d)$ for $p<2$ and $\tilde O(\varepsilon^{-p}d^{p/2})$ for $p>2$.
For $1\le p<2$, every matrix has a subset of $\tilde O(\varepsilon^{-1}k)$ rows which spans a $(1+\varepsilon)$-approximately optimal $k$-dimensional subspace for $\ell_p$ subspace approximation.
arXiv Detail & Related papers (2024-06-04T15:50:42Z) - Optimal bounds for $\ell_p$ sensitivity sampling via $\ell_2$ augmentation [56.403488578703865]
We show that by augmenting the $\ell_p$ sensitivities with $\ell_2$ sensitivities, we obtain better bounds on the optimal linear $\tilde O(\varepsilon^{-2}\mu^2 d)$ sampling complexity.
We also obtain an $\tilde O(\varepsilon^{-2}\mu^2 d)$ sensitivity sampling bound for logistic regression.
arXiv Detail & Related papers (2024-06-01T07:03:40Z) - Near-Optimal Distributed Minimax Optimization under the Second-Order Similarity [22.615156512223763]
We propose the stochastic variance-reduced optimistic gradient sliding (SVOGS) method, which takes advantage of the finite-sum structure in the objective.
We prove communication rounds of $\mathcal O(\delta D^2/\varepsilon)$, communication complexity of $\mathcal O(n+\sqrt{n}\,\delta D^2/\varepsilon)$, and local gradient calls of $\tilde{\mathcal O}(n+(\sqrt{n}\,\delta+L)D^2/\varepsilon)$.
arXiv Detail & Related papers (2024-05-25T08:34:49Z) - Estimation of Entropy in Constant Space with Improved Sample Complexity [14.718968517824756]
We give a new constant memory scheme that reduces the sample complexity to $(k/\epsilon^2)\cdot \mathrm{polylog}(1/\epsilon)$.
We conjecture that this is optimal up to $\mathrm{polylog}(1/\epsilon)$ factors.
arXiv Detail & Related papers (2022-05-19T18:51:28Z) - Tight Bounds on the Hardness of Learning Simple Nonparametric Mixtures [9.053430799456587]
We study the problem of learning nonparametric distributions in a finite mixture.
We establish tight bounds on the sample complexity for learning the component distributions in such models.
arXiv Detail & Related papers (2022-03-28T23:53:48Z) - Threshold Phenomena in Learning Halfspaces with Massart Noise [56.01192577666607]
We study the problem of PAC learning halfspaces on $\mathbb{R}^d$ with Massart noise under Gaussian marginals.
Our results qualitatively characterize the complexity of learning halfspaces in the Massart model.
arXiv Detail & Related papers (2021-08-19T16:16:48Z) - Self-training Converts Weak Learners to Strong Learners in Mixture
Models [86.7137362125503]
We show that a pseudolabeler $\boldsymbol\beta_{\mathrm{pl}}$ can achieve classification error at most $C_{\mathrm{err}}$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $\boldsymbol\beta_{\mathrm{pl}}$ with classification error $C_{\mathrm{err}}$ using only $O(d)$ labeled examples.
arXiv Detail & Related papers (2021-06-25T17:59:16Z) - Infinite-Horizon Offline Reinforcement Learning with Linear Function
Approximation: Curse of Dimensionality and Algorithm [46.36534144138337]
In this paper, we investigate the sample complexity of policy evaluation in offline reinforcement learning.
Under the low distribution shift assumption, we show that there is an algorithm that needs at most $O\left(\max\left\{\frac{\left\Vert \theta^{\pi}\right\Vert_2^4}{\varepsilon^4}\log\frac{d}{\delta},\ \frac{1}{\varepsilon^2}\left(d+\log\frac{1}{\delta}\right)\right\}\right)$ samples to approximate the value function up to error $\varepsilon$.
arXiv Detail & Related papers (2021-03-17T18:18:57Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $\gamma \in (0,1)$, and an approximation threshold $\epsilon > 0$, we provide a model-free algorithm to learn an $\epsilon$-optimal policy.
For small enough $\epsilon$, we show an algorithm with improved sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z)