Local Intrinsic Dimensional Entropy
- URL: http://arxiv.org/abs/2304.02223v3
- Date: Wed, 24 May 2023 09:43:14 GMT
- Title: Local Intrinsic Dimensional Entropy
- Authors: Rohan Ghosh, Mehul Motani
- Abstract summary: Most entropy measures depend on the spread of the probability distribution over the sample space $\mathcal{X}$.
In this work, we question the role of cardinality and distribution spread in defining entropy measures for continuous spaces.
We find that the average value of the local intrinsic dimension of a distribution, denoted as ID-Entropy, can serve as a robust entropy measure for continuous spaces.
- Score: 29.519376857728325
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Most entropy measures depend on the spread of the probability distribution
over the sample space $\mathcal{X}$, and the maximum entropy achievable scales
proportionately with the sample space cardinality $|\mathcal{X}|$. For a finite
$|\mathcal{X}|$, this yields robust entropy measures which satisfy many
important properties, such as invariance to bijections, while the same is not
true for continuous spaces (where $|\mathcal{X}|=\infty$). Furthermore, since
$\mathbb{R}$ and $\mathbb{R}^d$ ($d\in \mathbb{Z}^+$) have the same cardinality
(from Cantor's correspondence argument), cardinality-dependent entropy measures
cannot encode the data dimensionality. In this work, we question the role of
cardinality and distribution spread in defining entropy measures for continuous
spaces, which can undergo multiple rounds of transformations and distortions,
e.g., in neural networks. We find that the average value of the local intrinsic
dimension of a distribution, denoted as ID-Entropy, can serve as a robust
entropy measure for continuous spaces, while capturing the data dimensionality.
We find that ID-Entropy satisfies many desirable properties and can be extended
to conditional entropy, joint entropy and mutual-information variants.
ID-Entropy also yields new information bottleneck principles and links to
causality. In the context of deep learning, for feedforward architectures, we
show, theoretically and empirically, that the ID-Entropy of a hidden layer
directly controls the generalization gap for both classifiers and
auto-encoders, when the target function is Lipschitz continuous. Our work
primarily shows that, for continuous spaces, taking a structural rather than a
statistical approach yields entropy measures which preserve intrinsic data
dimensionality, while being relevant for studying various architectures.
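The abstract defines ID-Entropy as the average local intrinsic dimension of a distribution. The sketch below illustrates one plausible way to approximate it from samples: estimate the local intrinsic dimension at each point with the Levina-Bickel k-nearest-neighbor MLE and average the per-point estimates. The estimator choice, the value of k, and the function names (local_id_mle, id_entropy) are illustrative assumptions, not the paper's exact construction.

```python
import numpy as np
from scipy.spatial import cKDTree

def local_id_mle(x, tree, k=10):
    # Levina-Bickel MLE of the local intrinsic dimension at point x:
    #   m(x) = (k - 1) / sum_{j=1}^{k-1} log(r_k / r_j),
    # where r_j is the distance from x to its j-th nearest neighbor.
    dists, _ = tree.query(x, k=k + 1)   # k+1 because the nearest point is x itself
    r = dists[1:]
    return (k - 1) / np.sum(np.log(r[-1] / r[:-1]))

def id_entropy(samples, k=10):
    # ID-Entropy proxy (an assumption here): the sample mean of the
    # per-point local intrinsic dimension estimates.
    tree = cKDTree(samples)
    return float(np.mean([local_id_mle(x, tree, k) for x in samples]))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A slightly noisy circle embedded in R^3: intrinsic dimension close to 1.
    theta = rng.uniform(0.0, 2.0 * np.pi, size=2000)
    points = np.stack([np.cos(theta), np.sin(theta),
                       1e-3 * rng.standard_normal(2000)], axis=1)
    print("ID-Entropy estimate:", id_entropy(points, k=10))  # expect a value near 1
```

For larger or higher-dimensional datasets the neighborhood size k and the neighbor-search backend would need tuning; the snippet is only meant to make the definition concrete.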
Related papers
- Designing a Linearized Potential Function in Neural Network Optimization Using Csiszár Type of Tsallis Entropy [0.0]
In this paper, we establish a framework that utilizes a linearized potential function via a Csiszár-type Tsallis entropy.
We show that our new framework enables us to derive an exponential convergence result.
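For context, the classical Tsallis entropy that this line of work builds on has a simple closed form; the snippet below is a minimal definition with the Shannon limit at q -> 1. It is only background for the summary above and is not the linearized potential function or the Csiszár-type construction from the paper; the function name and the example values are illustrative.

```python
import numpy as np

def tsallis_entropy(p, q=2.0):
    # Classical Tsallis entropy S_q(p) = (1 - sum_i p_i^q) / (q - 1),
    # which recovers Shannon entropy (in nats) as q -> 1.
    p = np.asarray(p, dtype=float)
    if np.isclose(q, 1.0):
        nz = p > 0
        return float(-np.sum(p[nz] * np.log(p[nz])))
    return float((1.0 - np.sum(p ** q)) / (q - 1.0))

if __name__ == "__main__":
    p = np.array([0.5, 0.25, 0.25])
    print(tsallis_entropy(p, q=2.0))  # (1 - 0.375) / 1 = 0.625
    print(tsallis_entropy(p, q=1.0))  # Shannon entropy ~ 1.0397 nats
```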
arXiv Detail & Related papers (2024-11-06T02:12:41Z)
- Adaptive Learning of the Latent Space of Wasserstein Generative Adversarial Networks [7.958528596692594]
We propose a novel framework called the latent Wasserstein GAN (LWGAN).
It fuses the Wasserstein auto-encoder and the Wasserstein GAN so that the intrinsic dimension of the data manifold can be adaptively learned.
We show that LWGAN is able to identify the correct intrinsic dimension under several scenarios.
arXiv Detail & Related papers (2024-09-27T01:25:22Z)
- Learning with Norm Constrained, Over-parameterized, Two-layer Neural Networks [54.177130905659155]
Recent studies show that a reproducing kernel Hilbert space (RKHS) is not a suitable space for modeling functions computed by neural networks.
In this paper, we study a suitable function space for over-parameterized two-layer neural networks with bounded norms.
arXiv Detail & Related papers (2024-04-29T15:04:07Z)
- Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that the input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and that the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- the effective Minkowski dimension.
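The effective Minkowski dimension is presented as a relaxation of the classical Minkowski (box-counting) dimension. The sketch below estimates only the classical quantity, by counting occupied boxes at several scales and fitting the slope of log N(eps) against log(1/eps); the scale grid, the sample data, and the function names are illustrative assumptions, not the paper's definition.

```python
import numpy as np

def box_count(points, eps):
    # Number of axis-aligned boxes of side eps needed to cover the points.
    return len(np.unique(np.floor(points / eps), axis=0))

def minkowski_dimension(points, scales):
    # Box-counting estimate: slope of log N(eps) versus log(1/eps).
    counts = np.array([box_count(points, eps) for eps in scales])
    slope, _ = np.polyfit(np.log(1.0 / np.array(scales)), np.log(counts), 1)
    return float(slope)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # A 1-D curve (a helix) embedded in R^3: box-counting dimension close to 1.
    t = rng.uniform(0.0, 4.0 * np.pi, size=20_000)
    helix = np.stack([np.cos(t), np.sin(t), 0.1 * t], axis=1)
    scales = [0.4, 0.2, 0.1, 0.05, 0.025]
    print("estimated dimension:", minkowski_dimension(helix, scales))  # expect ~1
```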
arXiv Detail & Related papers (2023-06-26T17:13:31Z)
- Combating Mode Collapse in GANs via Manifold Entropy Estimation [70.06639443446545]
Generative Adversarial Networks (GANs) have shown compelling results in various tasks and applications.
We propose a novel training pipeline to address the mode collapse issue of GANs.
arXiv Detail & Related papers (2022-08-25T12:33:31Z)
- Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z)
- Measuring dissimilarity with diffeomorphism invariance [94.02751799024684]
We introduce DID, a pairwise dissimilarity measure applicable to a wide range of data spaces.
We prove that DID enjoys properties which make it relevant for theoretical study and practical use.
arXiv Detail & Related papers (2022-02-11T13:51:30Z)
- ChebLieNet: Invariant Spectral Graph NNs Turned Equivariant by Riemannian Geometry on Lie Groups [9.195729979000404]
ChebLieNet is a group-equivariant method on (anisotropic) manifolds.
We develop a graph neural network made of anisotropic convolutional layers.
We empirically demonstrate the existence of (data-dependent) sweet spots for the anisotropic parameters on CIFAR10.
arXiv Detail & Related papers (2021-11-23T20:19:36Z)
- Asymptotic Causal Inference [6.489113969363787]
We investigate causal inference in the regime where the number of variables approaches infinity, using an information-theoretic framework.
We define structural entropy of a causal model in terms of its description complexity measured by the logarithmic growth rate.
We generalize a recently popular bipartite experimental design for studying causal inference on large datasets.
arXiv Detail & Related papers (2021-09-20T16:16:00Z)
- Generalized Entropy Regularization or: There's Nothing Special about Label Smoothing [83.78668073898001]
We introduce a family of entropy regularizers, which includes label smoothing as a special case.
We find that variance in model performance can be explained largely by the resulting entropy of the model.
We advise the use of other entropy regularization methods in place of label smoothing.
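As a concrete example of entropy regularization in general (not the specific regularizer family introduced in that paper), the sketch below adds a confidence penalty, beta times the negative entropy of the predicted distribution, to a cross-entropy loss; label smoothing corresponds instead to adding KL(uniform || model) to the loss. The function names and the beta value are illustrative assumptions.

```python
import numpy as np

def softmax(logits):
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def entropy_regularized_loss(logits, labels, beta=0.1):
    # Cross-entropy minus beta * H(p): a simple confidence penalty.
    # beta = 0 gives plain cross-entropy; larger beta discourages
    # over-confident (low-entropy) predictions.
    p = softmax(logits)
    n = logits.shape[0]
    ce = -np.log(p[np.arange(n), labels] + 1e-12).mean()
    ent = -(p * np.log(p + 1e-12)).sum(axis=-1).mean()
    return ce - beta * ent

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    logits = rng.normal(size=(4, 5))
    labels = np.array([0, 2, 1, 4])
    print(entropy_regularized_loss(logits, labels, beta=0.1))
```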
arXiv Detail & Related papers (2020-05-02T12:46:28Z)
- On the Estimation of Information Measures of Continuous Distributions [25.395010130602287]
The estimation of information measures of continuous distributions from samples is a fundamental problem in statistics and machine learning.
We provide confidence bounds for simple histogram-based estimation of differential entropy from a fixed number of samples. (A minimal sketch of such an estimator appears after this entry.)
Our focus is on differential entropy, but we provide examples that show that similar results hold for mutual information and relative entropy as well.
arXiv Detail & Related papers (2020-02-07T15:36:10Z)
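To make the histogram-based estimator mentioned above concrete, the sketch below computes the standard plug-in estimate H = -sum_i p_i * log(p_i / w_i) from bin probabilities p_i and bin widths w_i, and compares it against the known differential entropy of a standard Gaussian. The bin count, sample size, and function name are illustrative choices, not the setting analyzed in the paper.

```python
import numpy as np

def histogram_differential_entropy(samples, n_bins=50):
    # Plug-in histogram estimate of differential entropy (in nats):
    #   H_hat = -sum_i p_i * log(p_i / w_i),
    # where p_i is the empirical probability of bin i and w_i its width.
    counts, edges = np.histogram(samples, bins=n_bins)
    widths = np.diff(edges)
    p = counts / counts.sum()
    nz = p > 0                      # skip empty bins (0 * log 0 = 0)
    return float(-np.sum(p[nz] * np.log(p[nz] / widths[nz])))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=100_000)                 # standard Gaussian samples
    est = histogram_differential_entropy(x, n_bins=100)
    true_h = 0.5 * np.log(2 * np.pi * np.e)      # ~1.4189 nats
    print(f"estimate: {est:.4f}  true: {true_h:.4f}")
```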