On the Estimation of Information Measures of Continuous Distributions
- URL: http://arxiv.org/abs/2002.02851v3
- Date: Wed, 24 Nov 2021 22:41:02 GMT
- Title: On the Estimation of Information Measures of Continuous Distributions
- Authors: Georg Pichler and Pablo Piantanida and Günther Koliander
- Abstract summary: The estimation of information measures of continuous distributions based on samples is a fundamental problem in statistics and machine learning.
We provide confidence bounds for simple histogram-based estimation of differential entropy from a fixed number of samples.
Our focus is on differential entropy, but we provide examples that show that similar results hold for mutual information and relative entropy as well.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The estimation of information measures of continuous distributions based on
samples is a fundamental problem in statistics and machine learning. In this
paper, we analyze estimates of differential entropy in $K$-dimensional
Euclidean space, computed from a finite number of samples, when the probability
density function belongs to a predetermined convex family $\mathcal{P}$. First,
estimating differential entropy to any accuracy is shown to be infeasible if
the differential entropy of densities in $\mathcal{P}$ is unbounded, clearly
showing the necessity of additional assumptions. Subsequently, we investigate
sufficient conditions that enable confidence bounds for the estimation of
differential entropy. In particular, we provide confidence bounds for simple
histogram-based estimation of differential entropy from a fixed number of
samples, assuming that the probability density function is Lipschitz continuous
with known Lipschitz constant and known, bounded support. Our focus is on
differential entropy, but we provide examples that show that similar results
hold for mutual information and relative entropy as well.
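As a concrete illustration of the histogram approach described in the abstract, the following is a minimal sketch of a plug-in histogram estimator of differential entropy. The unit-cube support and the bin count are illustrative assumptions; the paper's confidence bounds additionally depend on the Lipschitz constant and the sample size, which this sketch does not reproduce.

```python
import numpy as np

def histogram_differential_entropy(samples, bins_per_dim):
    """Plug-in histogram estimate of differential entropy in nats.

    Assumes the density is supported on the unit cube [0, 1]^K, a
    simple stand-in for the paper's known-bounded-support setting.
    """
    samples = np.asarray(samples)
    n, K = samples.shape
    counts, _ = np.histogramdd(samples, bins=[bins_per_dim] * K,
                               range=[(0.0, 1.0)] * K)
    p = counts.ravel() / n
    p = p[p > 0]                          # empty cells contribute nothing
    cell_volume = (1.0 / bins_per_dim) ** K
    # Discrete entropy of the cell probabilities plus the log cell volume.
    return -np.sum(p * np.log(p)) + np.log(cell_volume)

# Sanity check: the uniform density on [0, 1]^2 has differential entropy 0.
rng = np.random.default_rng(0)
print(histogram_differential_entropy(rng.uniform(size=(100_000, 2)), 20))
```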
Related papers
- Convergence of Score-Based Discrete Diffusion Models: A Discrete-Time Analysis
We study the theoretical aspects of score-based discrete diffusion models under the Continuous Time Markov Chain (CTMC) framework.
We introduce a discrete-time sampling algorithm in the general state space $[S]^d$ that utilizes score estimators at predefined time points.
Our convergence analysis employs a Girsanov-based method and establishes key properties of the discrete score function.
arXiv Detail & Related papers (2024-10-03T09:07:13Z)
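As a toy illustration of this sampling idea (my construction, not the paper's algorithm), the sketch below reverses a two-state forward CTMC by discrete-time Euler steps, using the exact probability ratios in place of learned score estimators; the flip rate, horizon, and step count are arbitrary choices.

```python
import numpy as np

rng = np.random.default_rng(0)
p0_target = 0.9  # data distribution: mass 0.9 on state 0, 0.1 on state 1

def p_t(state, t):
    # Forward CTMC on {0, 1} with symmetric flip rate 1 gives
    # p_t(0) = 1/2 + (p_0(0) - 1/2) * exp(-2 t).
    p0 = 0.5 + (p0_target - 0.5) * np.exp(-2.0 * t)
    return p0 if state == 0 else 1.0 - p0

def score_ratio(y, x, t):
    # The "score" of a discrete diffusion is the ratio p_t(y) / p_t(x);
    # here it is exact, standing in for a learned score estimator.
    return p_t(y, t) / p_t(x, t)

def reverse_sample(T=5.0, n_steps=500):
    x = int(rng.integers(0, 2))   # the time-T law is essentially uniform
    dt = T / n_steps
    for k in range(n_steps):
        t = T - k * dt
        y = 1 - x
        # Euler step of the reverse CTMC: jump rate = forward rate * ratio.
        if rng.random() < min(1.0, dt * score_ratio(y, x, t)):
            x = y
    return x

samples = [reverse_sample() for _ in range(2000)]
print(np.mean(np.array(samples) == 0))  # should be close to 0.9
```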
- Non-asymptotic bounds for forward processes in denoising diffusions: Ornstein-Uhlenbeck is hard to beat
This paper presents explicit non-asymptotic bounds on the forward diffusion error in total variation (TV).
We parametrise multi-modal data distributions in terms of the distance $R$ to their furthest modes and consider forward diffusions with additive and multiplicative noise.
arXiv Detail & Related papers (2024-08-25T10:28:31Z)
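A small numerical companion (my construction, not the paper's bounds): for an Ornstein-Uhlenbeck forward process started from a bimodal mixture, the law at time t is available in closed form, so the forward error in TV against the N(0, 1) stationary law can be evaluated on a grid and seen to decay in t and grow with the mode distance R.

```python
import numpy as np

def gauss(x, mu, var):
    return np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)

def tv_to_stationary(t, R, var0=0.25):
    # OU forward process dX = -X dt + sqrt(2) dW started from the bimodal
    # mixture 0.5 N(-R, var0) + 0.5 N(+R, var0); at time t the law is
    # still a two-component Gaussian mixture in closed form.
    decay = np.exp(-t)
    var_t = var0 * decay**2 + (1.0 - decay**2)
    x = np.linspace(-R - 10, R + 10, 20001)
    p_t = 0.5 * gauss(x, -R * decay, var_t) + 0.5 * gauss(x, R * decay, var_t)
    pi = gauss(x, 0.0, 1.0)   # N(0, 1) stationary law of the OU process
    return 0.5 * np.sum(np.abs(p_t - pi)) * (x[1] - x[0])

# The forward error decays with t and is larger for more distant modes.
for t in (0.5, 1.0, 2.0, 4.0):
    print(t, tv_to_stationary(t, R=5.0))
```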
- Convergence of Continuous Normalizing Flows for Learning Probability Distributions
Continuous normalizing flows (CNFs) are a generative method for learning probability distributions.
We study the theoretical properties of CNFs with linear regularity in learning probability distributions from a finite random sample.
We present a convergence analysis framework that encompasses the error due to velocity estimation, the discretization error, and the early stopping error.
arXiv Detail & Related papers (2024-03-31T03:39:04Z)
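To isolate the discretization error that such an analysis must control, here is a toy one-dimensional flow with a closed-form velocity field (my construction), so that velocity-estimation and early-stopping errors are absent and only the Euler discretization error remains.

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma = 3.0, 0.5          # target distribution N(mu, sigma^2)

def velocity(x, t):
    # Velocity field whose flow transports N(0, 1) at t = 0 to
    # N(mu, sigma^2) at t = 1 along the linear interpolation path
    # x_t = ((1 - t) + t * sigma) * x_0 + t * mu.
    a_t = (1.0 - t) + t * sigma
    return mu + (sigma - 1.0) * (x - t * mu) / a_t

# Euler discretization of the flow ODE dx/dt = v_t(x).
n_steps = 100
x = rng.standard_normal(50_000)
for k in range(n_steps):
    x = x + velocity(x, k / n_steps) / n_steps

print(x.mean(), x.std())      # ~3.0 and ~0.5 up to discretization error
```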
- Transformer-based Parameter Estimation in Statistics
We propose a transformer-based approach to parameter estimation.
It does not even require knowing the probability density function, which numerical methods typically need.
Our approach is shown to achieve similar or better accuracy, as measured by mean-squared error.
arXiv Detail & Related papers (2024-02-28T04:30:41Z)
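The summary gives no architectural details, so the following is a hypothetical PyTorch sketch of the general idea: a small transformer encoder maps an i.i.d. sample, treated as a set of tokens, to the parameters (mu, log sigma) of the Gaussian that generated it. All layer sizes, the mean pooling, and the training loop are my assumptions, not the paper's design.

```python
import torch
import torch.nn as nn

class SampleRegressor(nn.Module):
    def __init__(self, d_model=32):
        super().__init__()
        self.embed = nn.Linear(1, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, 2)       # outputs (mu, log sigma)

    def forward(self, x):                       # x: (batch, n_points, 1)
        h = self.encoder(self.embed(x))
        return self.head(h.mean(dim=1))         # permutation-invariant pooling

model = SampleRegressor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(100):                         # short demonstration run
    mu = torch.randn(64, 1)                     # random true parameters
    log_sigma = 0.5 * torch.randn(64, 1)
    x = mu[:, None] + log_sigma.exp()[:, None] * torch.randn(64, 128, 1)
    target = torch.cat([mu, log_sigma], dim=1)
    loss = ((model(x) - target) ** 2).mean()    # mean-squared error
    opt.zero_grad(); loss.backward(); opt.step()
print(loss.item())
```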
- On the Properties and Estimation of Pointwise Mutual Information Profiles
The pointwise mutual information profile, or simply profile, is the distribution of pointwise mutual information for a given pair of random variables.
We introduce a novel family of distributions, Bend and Mix Models, for which the profile can be accurately estimated using Monte Carlo methods.
arXiv Detail & Related papers (2023-10-16T10:02:24Z)
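Setting Bend and Mix Models aside, the Monte Carlo idea is easy to demonstrate on a case where the pointwise mutual information is available in closed form, e.g. a bivariate Gaussian with correlation rho (my example, not the paper's):

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.8, 200_000

# Sample from a standard bivariate normal with correlation rho.
x = rng.standard_normal(n)
y = rho * x + np.sqrt(1 - rho**2) * rng.standard_normal(n)

# Closed-form pointwise mutual information log p(x, y) / (p(x) p(y)).
pmi = (-0.5 * np.log(1 - rho**2)
       - (x**2 - 2 * rho * x * y + y**2) / (2 * (1 - rho**2))
       + (x**2 + y**2) / 2)

# The empirical distribution of `pmi` is a Monte Carlo estimate of the
# profile; its mean recovers the mutual information -0.5 log(1 - rho^2).
print(pmi.mean(), -0.5 * np.log(1 - rho**2))
profile, edges = np.histogram(pmi, bins=50, density=True)
```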
- Non-asymptotic approximations for Pearson's chi-square statistic and its application to confidence intervals for strictly convex functions of the probability weights of discrete distributions
We develop a non-asymptotic local normal approximation for multinomial probabilities.
We apply our results to find confidence intervals for the negative entropy of discrete distributions.
arXiv Detail & Related papers (2023-09-05T01:18:48Z)
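The cited intervals rest on a non-asymptotic chi-square approximation; as a simpler point of reference, here is the classical asymptotic (delta-method) interval for the plug-in entropy of a discrete distribution, which negation turns into an interval for the negative entropy. This is a sketch, not the paper's construction.

```python
import numpy as np

def entropy_ci(counts, z=1.96):
    """Plug-in Shannon entropy (nats) with a delta-method normal CI."""
    counts = np.asarray(counts, dtype=float)
    n = counts.sum()
    p = counts[counts > 0] / n
    log_p = np.log(p)
    h = -np.sum(p * log_p)
    # Asymptotic variance of the plug-in estimator: Var[-log p(X)] / n.
    var = np.sum(p * log_p**2) - np.sum(p * log_p) ** 2
    half = z * np.sqrt(var / n)
    return h, (h - half, h + half)

# Negating the endpoints gives an interval for the negative entropy.
print(entropy_ci([500, 300, 150, 50]))
```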
- Statistical Efficiency of Score Matching: The View from Isoperimetry
We show a tight connection between the statistical efficiency of score matching and the isoperimetric properties of the distribution being estimated.
We formalize these results both in the finite-sample regime and in the asymptotic regime.
arXiv Detail & Related papers (2022-10-03T06:09:01Z)
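For readers unfamiliar with the estimator whose efficiency is being analyzed, here is a toy implicit (Hyvarinen) score matching fit of a one-dimensional Gaussian by grid search; the data, grids, and model family are my illustrative choices, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(2.0, 1.5, size=10_000)

def ism_loss(mu, var, x):
    # Implicit score matching objective E[0.5 * s(x)^2 + s'(x)] for the
    # Gaussian score s(x) = -(x - mu) / var, whose derivative is -1 / var.
    s = -(x - mu) / var
    return np.mean(0.5 * s**2) - 1.0 / var

# Crude grid search for the minimizer; it recovers mean and variance.
mus = np.linspace(0.0, 4.0, 81)
variances = np.linspace(0.5, 4.0, 71)
losses = np.array([[ism_loss(m, v, data) for v in variances] for m in mus])
i, j = np.unravel_index(np.argmin(losses), losses.shape)
print(mus[i], variances[j])   # ~2.0 and ~2.25 (= 1.5 squared)
```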
- Statistical Properties of the Entropy from Ordinal Patterns
Knowing the joint distribution of the pair Entropy-Statistical Complexity for a large class of time series models would allow statistical tests that are unavailable to date.
We characterize the distribution of the empirical Shannon entropy for any model under which the true normalized entropy is neither zero nor one.
We present a two-sided test that checks whether there is enough evidence to reject the hypothesis that two signals produce ordinal patterns with the same Shannon entropy.
arXiv Detail & Related papers (2022-09-15T23:55:58Z)
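The entropy from ordinal patterns is the Shannon entropy of the empirical distribution of order patterns among consecutive values. A minimal sketch follows; the pattern order and the normalization by log(order!) are conventional choices, and this is only the statistic, not the paper's test.

```python
import numpy as np
from math import factorial
from collections import Counter

def permutation_entropy(x, order=3):
    """Normalized Shannon entropy of the ordinal patterns of a series."""
    x = np.asarray(x)
    patterns = [tuple(np.argsort(x[i:i + order]))
                for i in range(len(x) - order + 1)]
    counts = np.array(list(Counter(patterns).values()), dtype=float)
    p = counts / counts.sum()
    return -np.sum(p * np.log(p)) / np.log(factorial(order))

rng = np.random.default_rng(0)
print(permutation_entropy(rng.standard_normal(10_000)))       # ~1, white noise
print(permutation_entropy(np.sin(0.05 * np.arange(10_000))))  # much lower
```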
- Profile Entropy: A Fundamental Measure for the Learnability and Compressibility of Discrete Distributions
We show that for samples of discrete distributions, profile entropy is a fundamental measure unifying the concepts of estimation, inference, and compression.
Specifically, profile entropy a) determines the speed of estimating the distribution relative to the best natural estimator; b) characterizes the rate of inferring all symmetric properties compared with the best estimator over any label-invariant distribution collection; c) serves as the limit of profile compression.
arXiv Detail & Related papers (2020-02-26T17:49:04Z)
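The profile of a sample records, for each multiplicity, how many distinct symbols appear exactly that many times; profile entropy is then the entropy of this random object under the sampling distribution. Computing a sample's profile takes only a few lines (a sketch; the entropy over all possible profiles is the hard part and is omitted):

```python
from collections import Counter

def sample_profile(sample):
    """Map each multiplicity m to the number of distinct symbols that
    appear exactly m times in the sample."""
    multiplicities = Counter(Counter(sample).values())
    return dict(sorted(multiplicities.items()))

# 'a' appears 5 times, 'b' and 'r' twice, 'c' and 'd' once:
print(sample_profile("abracadabra"))   # {1: 2, 2: 2, 5: 1}
```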
- Posterior Ratio Estimation of Latent Variables
In some applications, we want to compare distributions of random variables that are inferred from observations.
We study the problem of estimating the ratio between two posterior probability density functions of a latent variable.
arXiv Detail & Related papers (2020-02-15T16:46:42Z)
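The summary leaves out the estimator, so here is a common generic route to the ratio of two densities known only through samples: probabilistic classification, sketched with scikit-learn on two ordinary Gaussian densities standing in for the posteriors of the cited paper.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
# Samples from the two densities whose ratio we want to estimate.
x_p = rng.normal(0.0, 1.0, size=(5000, 1))
x_q = rng.normal(1.0, 1.0, size=(5000, 1))

# Density-ratio estimation by classification: for balanced classes,
# p(x) / q(x) = P(label = p | x) / P(label = q | x).
X = np.vstack([x_p, x_q])
y = np.concatenate([np.ones(5000), np.zeros(5000)])
clf = LogisticRegression().fit(X, y)

x0 = np.array([[0.5]])
prob = clf.predict_proba(x0)[0, 1]
print(prob / (1 - prob))   # estimated p(x0)/q(x0); the true ratio here is 1
```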