Non-asymptotic approximations for Pearson's chi-square statistic and its
application to confidence intervals for strictly convex functions of the
probability weights of discrete distributions
- URL: http://arxiv.org/abs/2309.01882v1
- Date: Tue, 5 Sep 2023 01:18:48 GMT
- Title: Non-asymptotic approximations for Pearson's chi-square statistic and its
application to confidence intervals for strictly convex functions of the
probability weights of discrete distributions
- Authors: Eric Bax and Frédéric Ouimet
- Abstract summary: We develop a non-asymptotic local normal approximation for multinomial probabilities.
We apply our results to find confidence intervals for the negative entropy of discrete distributions.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this paper, we develop a non-asymptotic local normal approximation for
multinomial probabilities. First, we use it to find non-asymptotic total
variation bounds between the measures induced by uniformly jittered
multinomials and the multivariate normals with the same means and covariances.
From the total variation bounds, we also derive a comparison of the cumulative
distribution functions and quantile coupling inequalities between Pearson's
chi-square statistic (written as the normalized quadratic form of a multinomial
vector) and its multivariate normal analogue. We apply our results to find
confidence intervals for the negative entropy of discrete distributions. Our
method can be applied more generally to find confidence intervals for strictly
convex functions of the weights of discrete distributions.
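To make these objects concrete, here is a minimal Python sketch of Pearson's chi-square statistic (the normalized quadratic form of the multinomial vector) and of a plug-in confidence interval for the negative entropy, the paper's running example of a strictly convex function of the probability weights. The interval uses the classical delta method with the multinomial covariance; it is a stand-in for intuition under standard asymptotics, not the paper's non-asymptotic construction, and the function names are ours.

```python
import numpy as np
from scipy import stats

def pearson_chi_square(counts, p0):
    """Pearson's chi-square statistic sum_i (N_i - n*p0_i)^2 / (n*p0_i),
    i.e., the normalized quadratic form of the multinomial count vector."""
    counts = np.asarray(counts, dtype=float)
    expected = counts.sum() * np.asarray(p0, dtype=float)
    return np.sum((counts - expected) ** 2 / expected)

def negative_entropy_ci(counts, alpha=0.05):
    """Delta-method interval for the negative entropy sum_i p_i*log(p_i),
    a strictly convex function of the weights.  Uses the multinomial
    covariance diag(p) - p p^T; a classical stand-in, not the paper's
    non-asymptotic bound."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts[counts > 0] / n             # drop empty cells to avoid log(0)
    est = np.sum(p * np.log(p))            # plug-in estimate of -H(p)
    g = np.log(p) + 1.0                    # gradient of p -> sum p_i*log(p_i)
    var = (np.sum(p * g**2) - np.sum(p * g) ** 2) / n  # g^T Sigma g / n
    half = stats.norm.ppf(1 - alpha / 2) * np.sqrt(var)
    return est - half, est + half

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    counts = rng.multinomial(10_000, [0.5, 0.3, 0.2])
    print(pearson_chi_square(counts, [0.5, 0.3, 0.2]))
    print(negative_entropy_ci(counts))
```

The paper's CDF comparisons and quantile coupling inequalities quantify how closely the chi-square statistic tracks its multivariate normal analogue, which is the ingredient that turns normal-type approximations like the one above into non-asymptotic guarantees.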
Related papers
- Transformer-based Parameter Estimation in Statistics
We propose a transformer-based approach to parameter estimation.
Unlike numerical methods, it does not require knowledge of the probability density function.
It is shown that our approach achieves similar or better accuracy, as measured by mean squared error.
arXiv Detail & Related papers (2024-02-28T04:30:41Z)
- Statistical Efficiency of Score Matching: The View from Isoperimetry
We show a tight connection between statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the finite-sample regime and in the asymptotic regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
- Statistical Properties of the Entropy from Ordinal Patterns
Knowing the joint distribution of the pair Entropy-Statistical Complexity for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon's Entropy for any model under which the true normalized Entropy is neither zero nor one.
We present a bilateral test that verifies whether there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon's Entropy (a sketch of the ordinal-pattern entropy estimator follows this list).
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
- Joint Probability Estimation Using Tensor Decomposition and Dictionaries
We study non-parametric estimation of joint probabilities of a given set of discrete and continuous random variables from their (empirically estimated) 2D marginals.
We create a dictionary of various families of distributions by inspecting the data, and use it to approximate each decomposed factor of the product in the mixture.
arXiv Detail & Related papers (2022-03-03T11:55:51Z)
- Theoretical Error Analysis of Entropy Approximation for Gaussian Mixture
In this paper, we analyze the approximation error between the true entropy and the approximate one to reveal when this approximation works effectively.
Our results provide a guarantee that this approximation works well in higher-dimensional problems.
arXiv Detail & Related papers (2022-02-26T04:49:01Z)
- Efficient CDF Approximations for Normalizing Flows
We build upon the diffeomorphic properties of normalizing flows to estimate the cumulative distribution function (CDF) over a closed region.
Our experiments on popular flow architectures and UCI datasets show a marked improvement in sample efficiency as compared to traditional estimators.
arXiv Detail & Related papers (2022-02-23T06:11:49Z)
- A Unified Framework for Multi-distribution Density Ratio Estimation
Binary density ratio estimation (DRE) provides the foundation for many state-of-the-art machine learning algorithms.
We develop a general framework from the perspective of Bregman divergence minimization.
We show that our framework leads to methods that strictly generalize their counterparts in binary DRE.
arXiv Detail & Related papers (2021-12-07T01:23:20Z)
- A Stochastic Newton Algorithm for Distributed Convex Optimization
We analyze a Newton algorithm for homogeneous distributed convex optimization, where each machine can calculate gradients of the same population objective.
We show that our method can reduce the number and frequency of required communication rounds compared to existing methods, without hurting performance.
arXiv Detail & Related papers (2021-10-07T17:51:10Z)
- Characterizations of non-normalized discrete probability distributions and their application in statistics
We derive explicit formulae for the mass functions of discrete probability laws that identify those distributions.
Our characterizations, and hence the applications built on them, do not require any knowledge about normalization constants of the probability laws.
arXiv Detail & Related papers (2020-11-09T12:08:12Z)
- Minimax Optimal Estimation of KL Divergence for Continuous Distributions
Estimating the Kullback-Leibler divergence from independent and identically distributed samples is an important problem in various domains.
One simple and effective estimator is based on the k nearest neighbor distances between these samples (see the sketch after this list).
arXiv Detail & Related papers (2020-02-26T16:37:37Z)
- On the Estimation of Information Measures of Continuous Distributions
Estimation of information measures of continuous distributions based on samples is a fundamental problem in statistics and machine learning.
We provide confidence bounds for simple histogram-based estimation of differential entropy from a fixed number of samples.
Our focus is on differential entropy, but we provide examples that show that similar results hold for mutual information and relative entropy as well.
arXiv Detail & Related papers (2020-02-07T15:36:10Z)
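As referenced in the ordinal-patterns entry above, the quantity under study there is the empirical Shannon entropy of ordinal patterns. A minimal sketch of that estimator, using the standard Bandt-Pompe construction (the helper name and the normalization by log(order!) are our choices; this is not that paper's test itself):

```python
import math
from collections import Counter

def permutation_entropy(series, order=3):
    """Empirical Shannon entropy of ordinal patterns (Bandt-Pompe),
    normalized by log(order!) so the result lies in [0, 1]."""
    # The ordinal pattern of a window is the permutation that sorts it.
    patterns = Counter(
        tuple(sorted(range(order), key=lambda i: series[t + i]))
        for t in range(len(series) - order + 1)
    )
    total = sum(patterns.values())
    h = -sum((c / total) * math.log(c / total) for c in patterns.values())
    return h / math.log(math.factorial(order))
```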
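Likewise, the k-nearest-neighbor estimator mentioned in the minimax KL-divergence entry is, in its classical form, the Wang-Kulkarni-Verdu estimator. A sketch assuming continuous samples without ties (function name ours):

```python
import numpy as np
from scipy.spatial import cKDTree

def knn_kl_divergence(x, y, k=1):
    """k-NN estimate of KL(P || Q) from samples x ~ P and y ~ Q."""
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n, d = x.shape
    m = y.shape[0]
    # Distance from each x_i to its k-th neighbor within x
    # (k + 1 because x_i is its own nearest neighbor at distance 0).
    rho = cKDTree(x).query(x, k=k + 1)[0][:, -1]
    # Distance from each x_i to its k-th neighbor within y.
    nu = cKDTree(y).query(x, k=k)[0]
    if k > 1:
        nu = nu[:, -1]
    return (d / n) * np.sum(np.log(nu / rho)) + np.log(m / (n - 1))
```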