Related papers: Approximating the Total Variation Distance between Gaussians

URL: http://arxiv.org/abs/2503.11099v1
Date: Fri, 14 Mar 2025 05:42:10 GMT
Title: Approximating the Total Variation Distance between Gaussians
Authors: Arnab Bhattacharyya, Weiming Feng, Piyush Srivastava
Abstract summary: The total variation distance is a metric of central importance in statistics and probability theory. We give algorithms that approximate the total variation distance between two $n$-dimensional Gaussians. The main technical tool in our work is a reduction that helps us extend the recent progress on computing the TV-distance.
Score: 11.279878424960227
License: http://creativecommons.org/licenses/by/4.0/
Abstract: The total variation distance is a metric of central importance in statistics and probability theory. However, somewhat surprisingly, questions about computing it algorithmically appear not to have been systematically studied until very recently. In this paper, we contribute to this line of work by studying this question in the important special case of multivariate Gaussians. More formally, we consider the problem of approximating the total variation distance between two multivariate Gaussians to within an $\epsilon$-relative error. Previous works achieved a fixed constant relative error approximation via closed-form formulas. In this work, we give algorithms that given any two $n$-dimensional Gaussians $D_1,D_2$, and any error bound $\epsilon > 0$, approximate the total variation distance $D := d_{TV}(D_1,D_2)$ to $\epsilon$-relative accuracy in $\text{poly}(n,\frac{1}{\epsilon},\log \frac{1}{D})$ operations. The main technical tool in our work is a reduction that helps us extend the recent progress on computing the TV-distance between discrete random variables to our continuous setting.
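To make the quantity in the abstract concrete, here is a minimal sketch of a naive Monte Carlo baseline for one-dimensional Gaussians. It is not the algorithm from the paper: it simply uses the identity $d_{TV}(D_1,D_2) = \mathbb{E}_{x \sim D_1}[\max(0, 1 - p_2(x)/p_1(x))]$, where $p_1, p_2$ are the two densities. The function names, the sample count, and the restriction to one dimension are all assumptions made for illustration. A plain sampling estimate like this controls additive error, so reaching $\epsilon$-relative accuracy when $D$ is tiny would likely need a sample count that grows polynomially in $\frac{1}{D}$ rather than in $\log \frac{1}{D}$.

```python
import math
import random


def log_pdf(x, mean, std):
    """Log-density of the 1-D Gaussian N(mean, std^2) at x."""
    z = (x - mean) / std
    return -0.5 * z * z - math.log(std) - 0.5 * math.log(2.0 * math.pi)


def tv_monte_carlo(mean1, std1, mean2, std2, num_samples=200_000, seed=0):
    """Naive estimate of d_TV(D1, D2) = E_{x ~ D1}[max(0, 1 - p2(x)/p1(x))].

    Illustrative baseline only (hypothetical helper, not the paper's method).
    """
    rng = random.Random(seed)
    total = 0.0
    for _ in range(num_samples):
        x = rng.gauss(mean1, std1)
        log_ratio = log_pdf(x, mean2, std2) - log_pdf(x, mean1, std1)
        # max(0, 1 - exp(r)) equals -expm1(min(r, 0)); this form avoids overflow.
        total += -math.expm1(min(log_ratio, 0.0))
    return total / num_samples


def tv_equal_variance(mean1, mean2, std):
    """Closed form for equal variances: erf(|mean1 - mean2| / (2 * sqrt(2) * std))."""
    return math.erf(abs(mean1 - mean2) / (2.0 * math.sqrt(2.0) * std))


if __name__ == "__main__":
    print("closed form:", tv_equal_variance(0.0, 1.0, 1.0))
    print("monte carlo:", tv_monte_carlo(0.0, 1.0, 1.0, 1.0))
```

In the equal-variance case the two functions should agree up to sampling error, which gives a quick sanity check on the estimator before trying unequal variances, where no such simple closed form is used here.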

Related papers

Entangled Mean Estimation in High-Dimensions [36.97113089188035]
We study the task of high-dimensional entangled mean estimation in the subset-of-signals model. We show that the optimal error (up to polylogarithmic factors) is $f(\alpha,N) + \sqrt{D/(\alpha N)}$, where the term $f(\alpha,N)$ is the error of the one-dimensional problem and the second term is the sub-Gaussian error rate.
arXiv Detail & Related papers (2025-01-09T18:31:35Z)
Relative-Translation Invariant Wasserstein Distance [82.6068808353647]
We introduce a new family of distances, relative-translation invariant Wasserstein distances ($RW_p$). We show that $RW_p$ distances are also real distance metrics defined on the quotient set $\mathcal{P}_p(\mathbb{R}^n)/\sim$ and invariant to distribution translations.
arXiv Detail & Related papers (2024-09-04T03:41:44Z)
Convergence of Unadjusted Langevin in High Dimensions: Delocalization of Bias [13.642712817536072]
We show that as the dimension $d$ of the problem increases, the number of iterations required to ensure convergence within a desired error increases. A key technical challenge we address is the lack of a one-step contraction property in the $W_{2,\ell^\infty}$ metric to measure convergence.
arXiv Detail & Related papers (2024-08-20T01:24:54Z)
Near-Optimal Mean Estimation with Unknown, Heteroskedastic Variances [15.990720051907864]
Subset-of-Signals model serves as a benchmark for heteroskedastic mean estimation. Our algorithm resolves this open question up to logarithmic factors. Even for $d=2$, our techniques enable rates comparable to knowing the variance of each sample.
arXiv Detail & Related papers (2023-12-05T01:13:10Z)
Optimal Approximation of Zonoids and Uniform Approximation by Shallow Neural Networks [2.7195102129095003]
We study the following two related problems. The first is to determine what error an arbitrary zonoid in $\mathbb{R}^{d+1}$ can be approximated in the Hausdorff distance by a sum of $n$ line segments. The second is to determine optimal approximation rates in the uniform norm for shallow ReLU$^k$ neural networks on their variation spaces.
arXiv Detail & Related papers (2023-07-28T03:43:17Z)
Data Structures for Density Estimation [66.36971978162461]
Given a sublinear (in $n$) number of samples from $p$, our main result is the first data structure that identifies $v_i$ in time sublinear in $k$. We also give an improved version of the algorithm of Acharya et al. that reports $v_i$ in time linear in $k$.
arXiv Detail & Related papers (2023-06-20T06:13:56Z)
Beyond Moments: Robustly Learning Affine Transformations with Asymptotically Optimal Error [8.615625517708324]
We present a-time algorithm for learning an unknown affine transformation of the standard hypercube from samples. Our algorithm is based on a new method that iteratively improves an estimate of the unknown affine transformation whenever the requirements of the certificate are not met.
arXiv Detail & Related papers (2023-02-23T19:13:30Z)
A Law of Robustness beyond Isoperimetry [84.33752026418045]
We prove a Lipschitzness lower bound $\Omega(\sqrt{n/p})$ of robustness of interpolating neural network parameters on arbitrary distributions. We then show the potential benefit of overparametrization for smooth data when $n=\mathrm{poly}(d)$. We disprove the potential existence of an $O(1)$-Lipschitz robust interpolating function when $n=\exp(\omega(d))$.
arXiv Detail & Related papers (2022-02-23T16:10:23Z)
Small Covers for Near-Zero Sets of Polynomials and Learning Latent Variable Models [56.98280399449707]
We show that there exists an $\epsilon$-cover for $S$ of cardinality $M = (k/\epsilon)^{O_d(k^{1/d})}$. Building on our structural result, we obtain significantly improved learning algorithms for several fundamental high-dimensional probabilistic models with hidden variables.
arXiv Detail & Related papers (2020-12-14T18:14:08Z)
Optimal Robust Linear Regression in Nearly Linear Time [97.11565882347772]
We study the problem of high-dimensional robust linear regression where a learner is given access to $n$ samples from the generative model $Y = \langle X, w^* \rangle + \epsilon$. We propose estimators for this problem under two settings: (i) $X$ is L4-L2 hypercontractive, $\mathbb{E}[XX^\top]$ has bounded condition number and $\epsilon$ has bounded variance and (ii) $X$ is sub-Gaussian with identity second moment and $\epsilon$ is
arXiv Detail & Related papers (2020-07-16T06:44:44Z)
Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting. We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture within information theoretically near-optimal error of $\tilde{O}(\epsilon)$. Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)

This list is automatically generated from the titles and abstracts of the papers on this site.