Sharp Inequalities between Total Variation and Hellinger Distances for Gaussian Mixtures
- URL: http://arxiv.org/abs/2602.03202v1
- Date: Tue, 03 Feb 2026 07:14:12 GMT
- Title: Sharp Inequalities between Total Variation and Hellinger Distances for Gaussian Mixtures
- Authors: Joonhyuk Jung, Chao Gao
- Abstract summary: We study the relation between the total variation (TV) and Hellinger distances between two Gaussian location mixtures. Our results resolve an open problem raised in Jia et al. (2023) and thus lead to an entropic characterization of learning Gaussian mixtures in total variation.
- Score: 28.526133854008478
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We study the relation between the total variation (TV) and Hellinger distances between two Gaussian location mixtures. Our first result establishes a general upper bound: for any two mixing distributions supported on a compact set, the Hellinger distance between the two mixtures is controlled by the TV distance raised to a power $1-o(1)$, where the $o(1)$ term is of order $1/\log\log(1/\mathrm{TV})$. We also construct two sequences of mixing distributions that demonstrate the sharpness of this bound. Taken together, our results resolve an open problem raised in Jia et al. (2023) and thus lead to an entropic characterization of learning Gaussian mixtures in total variation. Our inequality also yields optimal robust estimation of Gaussian mixtures in Hellinger distance, which has a direct implication for bounding the minimax regret of empirical Bayes under Huber contamination.
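As a concrete illustration of the two quantities being compared, here is a minimal numerical sketch (our own illustration, not code from the paper; the mixing weights, locations, and grid are arbitrary choices) that computes the TV and Hellinger distances between two one-dimensional Gaussian location mixtures by quadrature:
```python
# Illustrative sketch (not from the paper): TV and Hellinger distances
# between two 1-D Gaussian location mixtures with unit variance,
# computed on a grid. Mixing weights/locations are arbitrary choices.
import numpy as np
from scipy.stats import norm

def mixture_pdf(x, locs, weights):
    """Density of a Gaussian location mixture with unit variance."""
    return sum(w * norm.pdf(x, loc=m) for m, w in zip(locs, weights))

x = np.linspace(-20, 20, 200_001)
dx = x[1] - x[0]
p = mixture_pdf(x, locs=[-1.0, 1.0], weights=[0.5, 0.5])
q = mixture_pdf(x, locs=[-1.0, 1.5], weights=[0.4, 0.6])

tv = 0.5 * np.abs(p - q).sum() * dx
hel = np.sqrt(0.5 * ((np.sqrt(p) - np.sqrt(q)) ** 2).sum() * dx)
print(f"TV = {tv:.4f}, Hellinger = {hel:.4f}")

# Classical bounds for any pair of distributions: H^2 <= TV <= sqrt(2)*H,
# i.e. generically H <= sqrt(TV). The paper improves the exponent 1/2 to
# 1 - o(1) for compactly supported Gaussian location mixtures.
assert hel**2 <= tv + 1e-8 and tv <= np.sqrt(2) * hel + 1e-8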
Related papers
- Learning general Gaussian mixtures with efficient score matching [16.06356123715737]
We study the problem of learning mixtures of $k$ Gaussians in $d$ dimensions.
We make no separation assumptions on the underlying mixture components.
We give an algorithm that draws $d^{\mathrm{poly}(k/\varepsilon)}$ samples from the target mixture, runs in sample-polynomial time, and constructs a sampler.
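A toy illustration of the object score-matching methods estimate (our own sketch, not the paper's algorithm): the score $\nabla_x \log p(x)$ of a one-dimensional Gaussian mixture in closed form, with arbitrary weights and means.
```python
# Illustrative sketch (not the paper's algorithm): closed-form score
# d/dx log p(x) of a 1-D Gaussian location mixture, the quantity that
# score-matching methods fit from samples.
import numpy as np

def mixture_score(x, locs, weights, sigma=1.0):
    """Score d/dx log p(x) of a 1-D Gaussian location mixture."""
    x = np.asarray(x, dtype=float)
    comp = np.exp(-((x[..., None] - locs) ** 2) / (2 * sigma**2))
    comp = comp * weights / (np.sqrt(2 * np.pi) * sigma)
    p = comp.sum(axis=-1)                      # mixture density p(x)
    dp = (comp * (locs - x[..., None]) / sigma**2).sum(axis=-1)
    return dp / p                              # grad log p = p' / p

locs = np.array([-2.0, 0.0, 3.0])
weights = np.array([0.2, 0.5, 0.3])
print(mixture_score(np.array([-1.0, 0.5, 2.0]), locs, weights))
```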
arXiv Detail & Related papers (2024-04-29T17:30:36Z)
- Mixtures of Gaussians are Privately Learnable with a Polynomial Number of Samples [9.649879910148854]
We study the problem of estimating mixtures of Gaussians under the constraint of differential privacy (DP).
Our main result is that $\mathrm{poly}(k, d, 1/\alpha, 1/\varepsilon, \log(1/\delta))$ samples are sufficient to estimate a mixture of $k$ Gaussians in $\mathbb{R}^d$ up to total variation distance $\alpha$.
This is the first finite sample complexity upper bound for the problem that does not make any structural assumptions on the GMMs.
arXiv Detail & Related papers (2023-09-07T17:02:32Z)
- The Parametric Stability of Well-separated Spherical Gaussian Mixtures [7.238973585403367]
We quantify the parameter stability of a spherical Gaussian Mixture Model (sGMM) under small perturbations in distribution space.
We derive the first explicit bound showing that, for a mixture of spherical Gaussians $P$ (sGMM) in a pre-defined model class, any other sGMM in this model class that is close to $P$ in total variation distance also has a small parameter distance to $P$.
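A quick numerical illustration of the phenomenon (our own sketch, not the paper's bound): perturbing one component mean moves the mixture in TV by a comparable amount, so small TV distance forces small parameter distance when components are well separated.
```python
# Illustrative check (not the paper's bound): perturb one mean of a
# spherical Gaussian mixture and compare the parameter distance with
# the resulting TV distance, computed on a grid.
import numpy as np
from scipy.stats import norm

def sgmm_pdf(x, locs, weights):
    return sum(w * norm.pdf(x, loc=m) for m, w in zip(locs, weights))

x = np.linspace(-15, 15, 60_001)
dx = x[1] - x[0]
base = sgmm_pdf(x, locs=[-2.0, 2.0], weights=[0.5, 0.5])

for delta in [0.01, 0.1, 0.5]:
    pert = sgmm_pdf(x, locs=[-2.0, 2.0 + delta], weights=[0.5, 0.5])
    tv = 0.5 * np.abs(base - pert).sum() * dx
    print(f"parameter distance {delta:.2f} -> TV distance {tv:.4f}")
```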
arXiv Detail & Related papers (2023-02-01T04:52:13Z)
- Theoretical Error Analysis of Entropy Approximation for Gaussian Mixtures [0.6990493129893112]
In this paper, we study the approximate entropy given by the mixing-weighted sum of the entropies of the unimodal Gaussian components. We theoretically analyze the approximation error between the true and the approximate entropy to reveal when this approximation works effectively. Our results provide a guarantee that this approximation works well for high-dimensional problems, such as neural networks.
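To make the approximation concrete, the sketch below (our own illustration, reading the approximation as a weighted sum of component entropies, which may differ in detail from the paper's exact estimator) compares it against a Monte Carlo estimate of the true mixture entropy; the classical sandwich $\sum_i w_i H(N_i) \le H(p) \le \sum_i w_i H(N_i) + H(w)$ brackets the truth within the entropy of the mixing weights.
```python
# Illustrative sketch (not the paper's exact estimator): Monte Carlo
# entropy of a 1-D Gaussian mixture vs. the weighted sum of component
# entropies. Weights, means, and sample size are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)
weights = np.array([0.3, 0.7])
locs = np.array([-2.0, 2.0])
sigma = 1.0

def mixture_pdf(x):
    comps = np.exp(-((x[:, None] - locs) ** 2) / (2 * sigma**2))
    comps /= np.sqrt(2 * np.pi) * sigma
    return comps @ weights

# Monte Carlo estimate of the true entropy H(p) = -E_p[log p(X)].
n = 200_000
ks = rng.choice(len(weights), size=n, p=weights)
samples = rng.normal(locs[ks], sigma)
h_true = -np.mean(np.log(mixture_pdf(samples)))

# Mixing-weighted sum of component entropies; each Gaussian component
# has entropy 0.5 * log(2 * pi * e * sigma^2).
h_approx = np.sum(weights * 0.5 * np.log(2 * np.pi * np.e * sigma**2))

# Known sandwich: h_approx <= H(p) <= h_approx + H(weights).
h_w = -np.sum(weights * np.log(weights))
print(h_approx, h_true, h_approx + h_w)
```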
arXiv Detail & Related papers (2022-02-26T04:49:01Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation [101.67420298343512]
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
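For orientation, here is the standard binary DRE baseline that such frameworks generalize (an illustrative sketch, not the paper's multi-distribution method): with balanced samples from $p$ and $q$, the logit of a logistic classifier trained to separate them estimates $\log p(x)/q(x)$.
```python
# Classifier-based density-ratio trick (standard binary DRE baseline,
# not the paper's method). With balanced classes, the logistic logit
# log P(y=1|x)/P(y=0|x) estimates log p(x)/q(x).
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(1)
xp = rng.normal(0.0, 1.0, size=(5000, 1))   # samples from p = N(0, 1)
xq = rng.normal(1.0, 1.0, size=(5000, 1))   # samples from q = N(1, 1)

X = np.vstack([xp, xq])
y = np.concatenate([np.ones(len(xp)), np.zeros(len(xq))])
clf = LogisticRegression().fit(X, y)

x0 = np.array([[0.5]])
log_ratio_hat = clf.decision_function(x0)[0]
log_ratio_true = 0.5 - x0[0, 0]   # closed form for N(0,1) vs N(1,1)
print(log_ratio_hat, log_ratio_true)   # both near 0 at x = 0.5
```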
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- Lower Bounds on the Total Variation Distance Between Mixtures of Two Gaussians [45.392805695921666]
We exploit a connection between total variation distance and the characteristic function of the mixture.
We derive new lower bounds on the total variation distance between pairs of two-component Gaussian mixtures.
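The elementary form of that connection (our own sketch; the paper's bounds are sharper): since $|\varphi_p(t) - \varphi_q(t)| = |\int e^{itx}(p - q)\,dx| \le 2\,\mathrm{TV}(p, q)$, any gap between the characteristic functions of two mixtures lower-bounds their TV distance.
```python
# Illustrative sketch (elementary bound only, not the paper's sharper
# machinery): a gap between characteristic functions lower-bounds TV.
import numpy as np

def mixture_cf(t, locs, weights, sigma=1.0):
    """Characteristic function of a Gaussian location mixture."""
    t = np.asarray(t, dtype=float)
    phases = np.exp(1j * np.outer(t, locs))    # e^{i t mu_j}
    return (phases @ weights) * np.exp(-0.5 * (sigma * t) ** 2)

t = np.linspace(-5, 5, 2001)
phi_p = mixture_cf(t, [-1.0, 1.0], np.array([0.5, 0.5]))
phi_q = mixture_cf(t, [0.0, 2.0], np.array([0.3, 0.7]))

# |phi_p - phi_q| <= 2 * TV, so half the max gap lower-bounds TV.
tv_lower = 0.5 * np.max(np.abs(phi_p - phi_q))
print(f"TV(p, q) >= {tv_lower:.4f}")
```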
arXiv Detail & Related papers (2021-09-02T16:32:16Z)
- Learning Gaussian Mixtures with Generalised Linear Models: Precise Asymptotics in High-dimensions [79.35722941720734]
Generalised linear models for multi-class classification problems are one of the fundamental building blocks of modern machine learning tasks.
We prove exact asymptotics characterising the estimator obtained via empirical risk minimisation in high dimensions.
We discuss how our theory can be applied beyond the scope of synthetic data.
arXiv Detail & Related papers (2021-06-07T16:53:56Z)
- Robustly Learning Mixtures of $k$ Arbitrary Gaussians [47.40835932474677]
We give a polynomial-time algorithm for the problem of robustly estimating a mixture of $k$ arbitrary Gaussians in $\mathbb{R}^d$, for any fixed $k$, in the presence of a constant fraction of arbitrary corruptions.
Our main tools are an efficient partial clustering algorithm that relies on the sum-of-squares method, and a novel tensor decomposition algorithm that allows errors in both Frobenius norm and low-rank terms.
arXiv Detail & Related papers (2020-12-03T17:54:03Z)
- Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $\epsilon$-corrupted $k$-mixture to information-theoretically near-optimal error $\tilde{O}(\epsilon)$.
Our main technical contribution is a new robust identifiability proof of clusters from a Gaussian mixture, which can be captured by the constant-degree Sum-of-Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z)
- Outlier-Robust Clustering of Non-Spherical Mixtures [5.863264019032882]
We give the first outlier-robust efficient algorithm for clustering a mixture of $k$ statistically separated $d$-dimensional Gaussians ($k$-GMMs).
Our results extend to clustering mixtures of arbitrary affine transforms of the uniform distribution on the $d$-dimensional unit sphere.
arXiv Detail & Related papers (2020-05-06T17:24:27Z)
- Algebraic and Analytic Approaches for Parameter Learning in Mixture Models [66.96778152993858]
We present two different approaches for parameter learning in several mixture models in one dimension.
For some of these distributions, our results represent the first guarantees for parameter estimation.
arXiv Detail & Related papers (2020-01-19T05:10:56Z)