Independence Testing for Bounded Degree Bayesian Network
- URL: http://arxiv.org/abs/2204.08690v1
- Date: Tue, 19 Apr 2022 06:16:14 GMT
- Title: Independence Testing for Bounded Degree Bayesian Network
- Authors: Arnab Bhattacharyya, Cl\'ement L. Canonne, and Joy Qiping Yang
- Abstract summary: We show that if $P$ has a sparse structure, then in fact only linearly many samples are required.
We also show that if $P$ is Markov with respect to a Bayesian network whose underlying DAG has in-degree bounded by $d$, then $tildeTheta (2d/2cdot n/varepsilon2)$ samples are necessary.
- Score: 4.230271396864461
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the following independence testing problem: given access to samples
from a distribution $P$ over $\{0,1\}^n$, decide whether $P$ is a product
distribution or whether it is $\varepsilon$-far in total variation distance
from any product distribution. For arbitrary distributions, this problem
requires $\exp(n)$ samples. We show in this work that if $P$ has a sparse
structure, then in fact only linearly many samples are required. Specifically,
if $P$ is Markov with respect to a Bayesian network whose underlying DAG has
in-degree bounded by $d$, then $\tilde{\Theta}(2^{d/2}\cdot n/\varepsilon^2)$
samples are necessary and sufficient for independence testing.
Related papers
- Outlier Robust Multivariate Polynomial Regression [27.03423421704806]
We are given a set of random samples $(mathbfx_i,y_i) in [-1,1]n times mathbbR$ that are noisy versions of $(mathbfx_i,p(mathbfx_i)$.
The goal is to output a $hatp$, within an $ell_in$-distance of at most $O(sigma)$ from $p$.
arXiv Detail & Related papers (2024-03-14T15:04:45Z) - Testing with Non-identically Distributed Samples [20.74768558932617]
We examine the extent to which sublinear-sample property testing and estimation applies to settings where samples are independently but not identically distributed.
Even with $Theta(k/varepsilon2)$ samples from each distribution, $textbfp_mathrmavg$ is sufficient to learn $textbfp_mathrmavg$ to within error $varepsilon$ in TV distance.
arXiv Detail & Related papers (2023-11-19T01:25:50Z) - Efficient Sampling of Stochastic Differential Equations with Positive
Semi-Definite Models [91.22420505636006]
This paper deals with the problem of efficient sampling from a differential equation, given the drift function and the diffusion matrix.
It is possible to obtain independent and identically distributed (i.i.d.) samples at precision $varepsilon$ with a cost that is $m2 d log (1/varepsilon)$
Our results suggest that as the true solution gets smoother, we can circumvent the curse of dimensionality without requiring any sort of convexity.
arXiv Detail & Related papers (2023-03-30T02:50:49Z) - Stochastic Approximation Approaches to Group Distributionally Robust
Optimization [96.26317627118912]
Group distributionally robust optimization (GDRO)
Online learning techniques to reduce the number of samples required in each round from $m$ to $1$, keeping the same sample.
A novel formulation of weighted GDRO, which allows us to derive distribution-dependent convergence rates.
arXiv Detail & Related papers (2023-02-18T09:24:15Z) - Near-Optimal Bounds for Testing Histogram Distributions [35.18069719489173]
We show that the histogram testing problem has sample complexity $widetilde Theta (sqrtnk / varepsilon + k / varepsilon2 + sqrtn / varepsilon2)$.
arXiv Detail & Related papers (2022-07-14T01:24:01Z) - Nonasymptotic one-and two-sample tests in high dimension with unknown
covariance structure [0.0]
We consider the problem of testing if $mu is $eta-close to zero, i.e. $|mu| leq eta against $|mu| geq (eta + delta)$.
The aim of this paper is to obtain nonasymptotic upper and lower bounds on the minimal separation distancedelta$ such that we can control both the Type I and Type II errors at a given level.
arXiv Detail & Related papers (2021-09-01T06:22:53Z) - The Price of Tolerance in Distribution Testing [31.10049510641336]
We show the sample complexity to be [fracsqrtnvarepsilon2 + fracnlog n cdotmaxleftfracvarepsilon2, providing a smooth tradeoff between the two previously known cases.
We also provide a similar characterization for the problem of tolerant equivalence testing, where both $p$ and $q$ are unknown.
arXiv Detail & Related papers (2021-06-25T03:59:42Z) - The Sample Complexity of Robust Covariance Testing [56.98280399449707]
We are given i.i.d. samples from a distribution of the form $Z = (1-epsilon) X + epsilon B$, where $X$ is a zero-mean and unknown covariance Gaussian $mathcalN(0, Sigma)$.
In the absence of contamination, prior work gave a simple tester for this hypothesis testing task that uses $O(d)$ samples.
We prove a sample complexity lower bound of $Omega(d2)$ for $epsilon$ an arbitrarily small constant and $gamma
arXiv Detail & Related papers (2020-12-31T18:24:41Z) - Optimal Testing of Discrete Distributions with High Probability [49.19942805582874]
We study the problem of testing discrete distributions with a focus on the high probability regime.
We provide the first algorithms for closeness and independence testing that are sample-optimal, within constant factors.
arXiv Detail & Related papers (2020-09-14T16:09:17Z) - Locally Private Hypothesis Selection [96.06118559817057]
We output a distribution from $mathcalQ$ whose total variation distance to $p$ is comparable to the best such distribution.
We show that the constraint of local differential privacy incurs an exponential increase in cost.
Our algorithms result in exponential improvements on the round complexity of previous methods.
arXiv Detail & Related papers (2020-02-21T18:30:48Z) - Curse of Dimensionality on Randomized Smoothing for Certifiable
Robustness [151.67113334248464]
We show that extending the smoothing technique to defend against other attack models can be challenging.
We present experimental results on CIFAR to validate our theory.
arXiv Detail & Related papers (2020-02-08T22:02:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.