Characterizing the Distinguishability of Product Distributions through Multicalibration
- URL: http://arxiv.org/abs/2412.03562v1
- Date: Wed, 04 Dec 2024 18:56:19 GMT
- Title: Characterizing the Distinguishability of Product Distributions through Multicalibration
- Authors: Cassandra Marcussen, Aaron L. Putterman, Salil Vadhan,
- Abstract summary: We prove a new, tight characterization of the number of samples $k$ needed to efficiently distinguish $X_0otimes k$ and $X_1otimes k$.
Our framework can be used to re-derive a result of Geier (TCC 2022), proving nearly-tight bounds on how computational inability scales with the number of samples.
- Score: 9.695176684285832
- License:
- Abstract: Given a sequence of samples $x_1, \dots , x_k$ promised to be drawn from one of two distributions $X_0, X_1$, a well-studied problem in statistics is to decide $\textit{which}$ distribution the samples are from. Information theoretically, the maximum advantage in distinguishing the two distributions given $k$ samples is captured by the total variation distance between $X_0^{\otimes k}$ and $X_1^{\otimes k}$. However, when we restrict our attention to $\textit{efficient distinguishers}$ (i.e., small circuits) of these two distributions, exactly characterizing the ability to distinguish $X_0^{\otimes k}$ and $X_1^{\otimes k}$ is more involved and less understood. In this work, we give a general way to reduce bounds on the computational indistinguishability of $X_0$ and $X_1$ to bounds on the $\textit{information-theoretic}$ indistinguishability of some specific, related variables $\widetilde{X}_0$ and $\widetilde{X}_1$. As a consequence, we prove a new, tight characterization of the number of samples $k$ needed to efficiently distinguish $X_0^{\otimes k}$ and $X_1^{\otimes k}$ with constant advantage as \[ k = \Theta\left(d_H^{-2}\left(\widetilde{X}_0, \widetilde{X}_1\right)\right), \] which is the inverse of the squared Hellinger distance $d_H$ between two distributions $\widetilde{X}_0$ and $\widetilde{X}_1$ that are computationally indistinguishable from $X_0$ and $X_1$. Likewise, our framework can be used to re-derive a result of Geier (TCC 2022), proving nearly-tight bounds on how computational indistinguishability scales with the number of samples for arbitrary product distributions.
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $mathRd suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - Outlier Robust Multivariate Polynomial Regression [27.03423421704806]
We are given a set of random samples $(mathbfx_i,y_i) in [-1,1]n times mathbbR$ that are noisy versions of $(mathbfx_i,p(mathbfx_i)$.
The goal is to output a $hatp$, within an $ell_in$-distance of at most $O(sigma)$ from $p$.
arXiv Detail & Related papers (2024-03-14T15:04:45Z) - Testing Closeness of Multivariate Distributions via Ramsey Theory [40.926523210945064]
We investigate the statistical task of closeness (or equivalence) testing for multidimensional distributions.
Specifically, given sample access to two unknown distributions $mathbf p, mathbf q$ on $mathbb Rd$, we want to distinguish between the case that $mathbf p=mathbf q$ versus $|mathbf p-mathbf q|_A_k > epsilon$.
Our main result is the first closeness tester for this problem with em sub-learning sample complexity in any fixed dimension.
arXiv Detail & Related papers (2023-11-22T04:34:09Z) - Identification of Mixtures of Discrete Product Distributions in
Near-Optimal Sample and Time Complexity [6.812247730094931]
We show, for any $ngeq 2k-1$, how to achieve sample complexity and run-time complexity $(1/zeta)O(k)$.
We also extend the known lower bound of $eOmega(k)$ to match our upper bound across a broad range of $zeta$.
arXiv Detail & Related papers (2023-09-25T09:50:15Z) - TURF: A Two-factor, Universal, Robust, Fast Distribution Learning
Algorithm [64.13217062232874]
One of its most powerful and successful modalities approximates every distribution to an $ell$ distance essentially at most a constant times larger than its closest $t$-piece degree-$d_$.
We provide a method that estimates this number near-optimally, hence helps approach the best possible approximation.
arXiv Detail & Related papers (2022-02-15T03:49:28Z) - Random matrices in service of ML footprint: ternary random features with
no performance loss [55.30329197651178]
We show that the eigenspectrum of $bf K$ is independent of the distribution of the i.i.d. entries of $bf w$.
We propose a novel random technique, called Ternary Random Feature (TRF)
The computation of the proposed random features requires no multiplication and a factor of $b$ less bits for storage compared to classical random features.
arXiv Detail & Related papers (2021-10-05T09:33:49Z) - Self-training Converts Weak Learners to Strong Learners in Mixture
Models [86.7137362125503]
We show that a pseudolabeler $boldsymbolbeta_mathrmpl$ can achieve classification error at most $C_mathrmerr$.
We additionally show that by running gradient descent on the logistic loss one can obtain a pseudolabeler $boldsymbolbeta_mathrmpl$ with classification error $C_mathrmerr$ using only $O(d)$ labeled examples.
arXiv Detail & Related papers (2021-06-25T17:59:16Z) - The Sparse Hausdorff Moment Problem, with Application to Topic Models [5.151973524974052]
We give an algorithm for identifying a $k$-mixture using samples of $m=2k$ iid binary random variables.
It suffices to know the moments to additive accuracy $w_mincdotzetaO(k)$.
arXiv Detail & Related papers (2020-07-16T04:23:57Z) - Model-Free Reinforcement Learning: from Clipped Pseudo-Regret to Sample
Complexity [59.34067736545355]
Given an MDP with $S$ states, $A$ actions, the discount factor $gamma in (0,1)$, and an approximation threshold $epsilon > 0$, we provide a model-free algorithm to learn an $epsilon$-optimal policy.
For small enough $epsilon$, we show an improved algorithm with sample complexity.
arXiv Detail & Related papers (2020-06-06T13:34:41Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z) - Sample Amplification: Increasing Dataset Size even when Learning is Impossible [15.864702679819544]
Given data drawn from an unknown distribution, $D$, to what extent is it possible to amplify'' this dataset and output an even larger set of samples that appear to have been drawn from $D$?
We formalize this question as follows: an $left(n, n + Theta(fracnsqrtk)right)$ amplifier exists, even though learning the distribution to small constant total variation distance requires $Theta(d)$ samples.
arXiv Detail & Related papers (2019-04-26T21:42:44Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.