Sharp finite-sample concentration of independent variables
- URL: http://arxiv.org/abs/2008.13293v5
- Date: Fri, 8 Oct 2021 21:16:56 GMT
- Title: Sharp finite-sample concentration of independent variables
- Authors: Akshay Balsubramani
- Abstract summary: We show an extension of Sanov's theorem on large deviations.
We control the tail probabilities of i.i.d. random variables with matching concentration and anti-concentration bounds.
- Score: 5.788221302433176
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We show an extension of Sanov's theorem on large deviations, controlling the
tail probabilities of i.i.d. random variables with matching concentration and
anti-concentration bounds. This result has a general scope, applies to samples
of any size, and has a short information-theoretic proof using elementary
techniques.
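As a hypothetical numerical illustration of the kind of bound the abstract describes (this is the classical Chernoff/Sanov-type bound for Bernoulli sums, not the paper's actual result), the tail probability of an i.i.d. Bernoulli sample mean can be compared against the large-deviations rate $\exp(-n\,\mathrm{KL}(a\,\|\,p))$:

```python
import math
import random

def kl_bernoulli(a, p):
    """KL divergence KL(Bern(a) || Bern(p)), the large-deviations rate function."""
    return a * math.log(a / p) + (1 - a) * math.log((1 - a) / (1 - p))

def empirical_tail(n, p, a, trials=20000, seed=0):
    """Monte Carlo estimate of P(sample mean of n Bernoulli(p) draws >= a)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        s = sum(1 for _ in range(n) if rng.random() < p)
        if s / n >= a:
            hits += 1
    return hits / trials

# Chernoff/Sanov upper bound: P(S_n / n >= a) <= exp(-n * KL(a || p)) for a > p.
n, p, a = 50, 0.5, 0.7
bound = math.exp(-n * kl_bernoulli(a, p))
est = empirical_tail(n, p, a)
```

The Monte Carlo estimate stays below the exponential bound, while Sanov-type anti-concentration results (as in the abstract) show the exponent is tight up to lower-order terms.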
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis [56.442307356162864]
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z) - Unveiling the Statistical Foundations of Chain-of-Thought Prompting Methods [59.779795063072655]
Chain-of-Thought (CoT) prompting and its variants have gained popularity as effective methods for solving multi-step reasoning problems.
We analyze CoT prompting from a statistical estimation perspective, providing a comprehensive characterization of its sample complexity.
arXiv Detail & Related papers (2024-08-25T04:07:18Z) - Theoretical Analysis of Leave-one-out Cross Validation for
Non-differentiable Penalties under High-dimensional Settings [12.029919627622954]
We provide finite sample upper bounds on the expected squared error of leave-one-out cross-validation (LO) in estimating the out-of-sample risk.
The theoretical framework presented here provides a solid foundation for elucidating empirical findings that show the accuracy of LO.
arXiv Detail & Related papers (2024-02-13T15:48:10Z) - Anomaly Detection with Variance Stabilized Density Estimation [49.46356430493534]
We present a variance-stabilized density estimation problem for maximizing the likelihood of the observed samples.
To obtain a reliable anomaly detector, we introduce a spectral ensemble of autoregressive models for learning the variance-stabilized distribution.
We have conducted an extensive benchmark with 52 datasets, demonstrating that our method leads to state-of-the-art results.
arXiv Detail & Related papers (2023-06-01T11:52:58Z) - Classification of Heavy-tailed Features in High Dimensions: a
Superstatistical Approach [1.4469725791865984]
We characterise the learning of a mixture of two clouds of data points with generic centroids.
We study the generalisation performance of the obtained estimator, analyse the role of regularisation, and analytically characterise the separability transition.
arXiv Detail & Related papers (2023-04-06T07:53:05Z) - Mean-Square Analysis of Discretized Itô Diffusions for Heavy-tailed
Sampling [17.415391025051434]
We analyze the complexity of sampling from a class of heavy-tailed distributions by discretizing a natural class of Itô diffusions associated with weighted Poincaré inequalities.
Based on a mean-square analysis, we establish the iteration complexity for obtaining a sample whose distribution is $\epsilon$-close to the target distribution in the Wasserstein-2 metric.
arXiv Detail & Related papers (2023-03-01T15:16:03Z) - Efficient CDF Approximations for Normalizing Flows [64.60846767084877]
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z) - Robust Estimation for Nonparametric Families via Generative Adversarial
Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these methods to robust mean estimation, second-moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z) - Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridge-less least squares.
We consider the common case of estimators defined by scale dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z) - Generic Error Bounds for the Generalized Lasso with Sub-Exponential Data [4.56877715768796]
This work performs a non-asymptotic analysis of the generalized Lasso under the assumption of sub-exponential data.
We show that the estimation error can be controlled by means of two complexity parameters that arise naturally from a generic-chaining-based proof strategy.
arXiv Detail & Related papers (2020-04-11T10:39:48Z) - Sharp Concentration Results for Heavy-Tailed Distributions [17.510560590853576]
We obtain concentration and large-deviation results for sums of independent and identically distributed random variables with heavy-tailed distributions.
Our main theorem not only recovers some existing results, such as the concentration of sums of sub-Weibull random variables, but also produces new results for sums of random variables with heavier tails.
arXiv Detail & Related papers (2020-03-30T21:05:29Z)
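As background for the last entry (a standard definition, not taken from the paper itself), a random variable $X$ is called sub-Weibull with tail parameter $\theta > 0$ when, for some scale $K > 0$,

```latex
\Pr\bigl(|X| \ge t\bigr) \;\le\; 2\exp\!\bigl(-(t/K)^{\theta}\bigr)
\qquad \text{for all } t \ge 0.
```

Here $\theta = 2$ recovers sub-Gaussian tails and $\theta = 1$ sub-exponential tails, while $\theta < 1$ covers the heavier tails targeted by the paper's new results.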
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.