Uniform Convergence Beyond Glivenko-Cantelli
- URL: http://arxiv.org/abs/2510.21506v1
- Date: Fri, 24 Oct 2025 14:33:22 GMT
- Title: Uniform Convergence Beyond Glivenko-Cantelli
- Authors: Tanmay Devale, Pramith Devulapalli, Steve Hanneke,
- Abstract summary: We introduce Uniform Mean Estimability, also called $UME-$ learnability, which captures when a collection permits uniform mean estimation by any arbitrary estimator.<n>We show that separability of the mean vectors is a sufficient condition for $UME-$ learnability.<n>We also establish that countable unions of $UME-$ learnable collections are also $UME-$ learnable.
- Score: 28.76184007708457
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We characterize conditions under which collections of distributions on $\{0,1\}^\mathbb{N}$ admit uniform estimation of their mean. Prior work from Vapnik and Chervonenkis (1971) has focused on uniform convergence using the empirical mean estimator, leading to the principle known as $P-$ Glivenko-Cantelli. We extend this framework by moving beyond the empirical mean estimator and introducing Uniform Mean Estimability, also called $UME-$ learnability, which captures when a collection permits uniform mean estimation by any arbitrary estimator. We work on the space created by the mean vectors of the collection of distributions. For each distribution, the mean vector records the expected value in each coordinate. We show that separability of the mean vectors is a sufficient condition for $UME-$ learnability. However, we show that separability of the mean vectors is not necessary for $UME-$ learnability by constructing a collection of distributions whose mean vectors are non-separable yet $UME-$ learnable using techniques fundamentally different from those used in our separability-based analysis. Finally, we establish that countable unions of $UME-$ learnable collections are also $UME-$ learnable, solving a conjecture posed in Cohen et al. (2025).
Related papers
- Anisotropic local law for non-separable sample covariance matrices [10.181748307494608]
We establish local laws for sample covariance matrices $K = N-1sum_i=1N g_ig_ig_i*$ where the random vectors $g_1, ldots, g_N in Rn$ are independent with common covariance $$.<n>We discuss several classes of non-separable examples satisfying our assumptions, including conditionally mean-zero distributions, the random features model $g = (Xw)$ arising in machine learning, and Gaussian measures with
arXiv Detail & Related papers (2026-02-20T03:28:51Z) - Ratio Covers of Convex Sets and Optimal Mixture Density Estimation [16.547739992791197]
We study density estimation in Kullback-Leibler divergence.<n>We identify the best possible high-probability guarantees in terms of the dictionary size, sample size, and confidence level.
arXiv Detail & Related papers (2026-02-18T02:25:18Z) - Almost Asymptotically Optimal Active Clustering Through Pairwise Observations [59.20614082241528]
We propose a new analysis framework for clustering $M$ items into an unknown number of $K$ distinct groups using noisy and actively collected responses.<n>We establish a fundamental lower bound on the expected number of queries needed to achieve a desired confidence in the accuracy of the clustering.<n>We develop a computationally feasible variant of the Generalized Likelihood Ratio statistic and show that its performance gap to the lower bound can be accurately empirically estimated.
arXiv Detail & Related papers (2026-02-05T14:16:47Z) - Stabilizing Fixed-Point Iteration for Markov Chain Poisson Equations [49.702772230127465]
We study finite-state Markov chains with $n$ states and transition matrix $P$.<n>We show that all non-decaying modes are captured by a real peripheral invariant subspace $mathcalK(P)$, and that the induced operator on the quotient space $mathbbRn/mathcalK(P) is strictly contractive, yielding a unique quotient solution.
arXiv Detail & Related papers (2026-01-31T02:57:01Z) - Near-Optimal Clustering in Mixture of Markov Chains [74.3828414695655]
We study the problem of clustering $T$ trajectories of length $H$, each generated by one of $K$ unknown ergodic Markov chains over a finite state space of size $S$.<n>We derive an instance-dependent, high-probability lower bound on the clustering error rate, governed by the weighted KL divergence between the transition kernels of the chains.<n>We then present a novel two-stage clustering algorithm.
arXiv Detail & Related papers (2025-06-02T05:10:40Z) - SoS Certifiability of Subgaussian Distributions and its Algorithmic Applications [37.208622097149714]
We prove that there is a universal constant $C>0$ so that for every $d inmathbb N$, every centered subgaussian distribution $mathcal D$ on $mathbb Rd$, and every even $p inmathbb N$, the $d-optimal inmathbb N$, and the $d-optimal inmathbb N$.
This establishes that every subgaussian distribution is emphS-certifiably subgaussian -- a condition that yields efficient learning algorithms for a wide variety of
arXiv Detail & Related papers (2024-10-28T16:36:58Z) - Distributional Reinforcement Learning with Dual Expectile-Quantile Regression [51.87411935256015]
quantile regression approach to distributional RL provides flexible and effective way of learning arbitrary return distributions.<n>We show that distributional estimation guarantees vanish, and we empirically observe that the estimated distribution rapidly collapses to its mean.<n>Motivated by the efficiency of $L$-based learning, we propose to jointly learn expectiles and quantiles of the return distribution in a way that allows efficient learning.
arXiv Detail & Related papers (2023-05-26T12:30:05Z) - Replicable Clustering [57.19013971737493]
We propose algorithms for the statistical $k$-medians, statistical $k$-means, and statistical $k$-centers problems by utilizing approximation routines for their counterparts in a black-box manner.
We also provide experiments on synthetic distributions in 2D using the $k$-means++ implementation from sklearn as a black-box that validate our theoretical results.
arXiv Detail & Related papers (2023-02-20T23:29:43Z) - On counterfactual inference with unobserved confounding [36.18241676876348]
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit.
We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors.
We derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality.
arXiv Detail & Related papers (2022-11-14T04:14:37Z) - Robust Learning of Optimal Auctions [84.13356290199603]
We study the problem of learning revenue-optimal multi-bidder auctions from samples when the samples of bidders' valuations can be adversarially corrupted or drawn from distributions that are adversarially perturbed.
We propose new algorithms that can learn a mechanism whose revenue is nearly optimal simultaneously for all true distributions'' that are $alpha$-close to the original distribution in Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2021-07-13T17:37:21Z) - $(f,\Gamma)$-Divergences: Interpolating between $f$-Divergences and
Integral Probability Metrics [6.221019624345409]
We develop a framework for constructing information-theoretic divergences that subsume both $f$-divergences and integral probability metrics (IPMs)
We show that they can be expressed as a two-stage mass-redistribution/mass-transport process.
Using statistical learning as an example, we demonstrate their advantage in training generative adversarial networks (GANs) for heavy-tailed, not-absolutely continuous sample distributions.
arXiv Detail & Related papers (2020-11-11T18:17:09Z) - Robustly Learning any Clusterable Mixture of Gaussians [55.41573600814391]
We study the efficient learnability of high-dimensional Gaussian mixtures in the adversarial-robust setting.
We provide an algorithm that learns the components of an $epsilon$-corrupted $k$-mixture within information theoretically near-optimal error proofs of $tildeO(epsilon)$.
Our main technical contribution is a new robust identifiability proof clusters from a Gaussian mixture, which can be captured by the constant-degree Sum of Squares proof system.
arXiv Detail & Related papers (2020-05-13T16:44:12Z) - Robust $k$-means Clustering for Distributions with Two Moments [4.21934751979057]
We consider the robust algorithms for the $k$-means clustering problem where a quantizer is constructed based on $N$ independent observations.
Our main results are median of means based non-asymptotic excess distortion bounds that hold under the two bounded moments assumption in a general separable Hilbert space.
arXiv Detail & Related papers (2020-02-06T16:36:53Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.