Intrinsic Dimension Estimation
- URL: http://arxiv.org/abs/2106.04018v1
- Date: Tue, 8 Jun 2021 00:05:39 GMT
- Title: Intrinsic Dimension Estimation
- Authors: Adam Block, Zeyu Jia, Yury Polyanskiy, and Alexander Rakhlin
- Abstract summary: We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
- Score: 92.87600241234344
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It has long been thought that high-dimensional data encountered in many
practical machine learning tasks have low-dimensional structure, i.e., the
manifold hypothesis holds. A natural question, thus, is to estimate the
intrinsic dimension of a given population distribution from a finite sample. We
introduce a new estimator of the intrinsic dimension and provide finite sample,
non-asymptotic guarantees. We then apply our techniques to get new sample
complexity bounds for Generative Adversarial Networks (GANs) depending only on
the intrinsic dimension of the data.
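The abstract does not spell out how the new estimator is constructed. Purely as an illustration of what nearest-neighbor-based intrinsic dimension estimation looks like in practice, the sketch below implements the classic TwoNN estimator of Facco et al.; it is an assumed stand-in for exposition, not the estimator introduced in this paper.

```python
# Minimal sketch of nearest-neighbor intrinsic dimension estimation.
# NOT the estimator from the paper above (its construction is not given in
# the abstract); this is the classic TwoNN estimator of Facco et al., shown
# only to illustrate estimating dimension from a finite sample.
import numpy as np
from scipy.spatial import cKDTree

def twonn_dimension(X: np.ndarray) -> float:
    """Estimate intrinsic dimension from ratios of 2nd- to 1st-NN distances."""
    tree = cKDTree(X)
    # k=3: each query returns the point itself plus its two nearest neighbors.
    dists, _ = tree.query(X, k=3)
    r1, r2 = dists[:, 1], dists[:, 2]
    mu = r2 / r1
    # Under the TwoNN model, mu follows a Pareto law with exponent d,
    # so the maximum-likelihood estimate is n / sum(log mu).
    return len(X) / np.sum(np.log(mu))

# Example: 2000 points on a 2-D surface embedded in R^3 (a swiss-roll shape).
rng = np.random.default_rng(0)
t = rng.uniform(0, 4 * np.pi, 2000)
h = rng.uniform(0, 10, 2000)
X = np.column_stack([t * np.cos(t), h, t * np.sin(t)])
print(round(twonn_dimension(X), 2))  # expected to be close to 2, not 3
```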
Related papers
- Relative intrinsic dimensionality is intrinsic to learning [49.5738281105287]
We introduce a new notion of the intrinsic dimension of a data distribution, which precisely captures the separability properties of the data.
For this intrinsic dimension, the rule of thumb above becomes a law: high intrinsic dimension guarantees highly separable data.
We show that this relative intrinsic dimension provides both upper and lower bounds on the probability of successfully learning and generalising in a binary classification problem.
arXiv Detail & Related papers (2023-10-10T10:41:45Z) - Estimating Shape Distances on Neural Representations with Limited Samples [5.959815242044236]
We develop a rigorous statistical theory for high-dimensional shape estimation.
We show that this estimator achieves lower bias than standard estimators on simulated and neural data, particularly in high-dimensional settings.
arXiv Detail & Related papers (2023-10-09T14:16:34Z) - On Deep Generative Models for Approximation and Estimation of Distributions on Manifolds [38.311376714689]
Generative networks can generate high-dimensional complex data from a low-dimensional easy-to-sample distribution.
We take such low-dimensional data structures into consideration by assuming that data distributions are supported on a low-dimensional manifold.
We show that the Wasserstein-1 loss converges to zero at a fast rate depending on the intrinsic dimension instead of the ambient data dimension.
arXiv Detail & Related papers (2023-02-25T22:34:19Z) - Intrinsic Dimensionality Estimation within Tight Localities: A Theoretical and Experimental Analysis [0.0]
We propose a local ID estimation strategy stable even for 'tight' localities consisting of as few as 20 sample points.
Our experimental results show that our proposed estimation technique can achieve notably smaller variance, while maintaining comparable levels of bias, at much smaller sample sizes than state-of-the-art estimators.
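The stabilized estimator itself is not described in this summary; as a point of reference for the local setting it targets, here is a minimal sketch of the standard Levina-Bickel maximum-likelihood estimator computed on a small k-nearest-neighbor neighborhood. It is an assumed baseline for illustration, not the method proposed in that paper.

```python
# Sketch of *local* intrinsic dimension estimation via the standard
# Levina-Bickel MLE on a k-NN neighborhood. Shown only to illustrate the
# local setting; NOT the stabilized estimator proposed in the paper above.
import numpy as np
from scipy.spatial import cKDTree

def local_id_mle(X: np.ndarray, k: int = 20) -> np.ndarray:
    """Levina-Bickel local ID estimate around every point, using k neighbors."""
    tree = cKDTree(X)
    # k+1 because the query point itself is returned with distance 0.
    dists, _ = tree.query(X, k=k + 1)
    r = dists[:, 1:]                              # (n, k): distances to the k NNs
    # MLE: inverse of the mean log-ratio of the k-th NN distance to the closer ones.
    log_ratios = np.log(r[:, -1:] / r[:, :-1])    # (n, k-1)
    return (k - 1) / log_ratios.sum(axis=1)

# With neighborhoods of only ~20 points, these per-point estimates are noisy,
# which is exactly the regime the paper's variance comparison concerns.
```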
arXiv Detail & Related papers (2022-09-29T00:00:11Z) - Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z) - Pure Exploration in Kernel and Neural Bandits [90.23165420559664]
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms.
To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space.
arXiv Detail & Related papers (2021-06-22T19:51:59Z) - A Local Similarity-Preserving Framework for Nonlinear Dimensionality Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of the data matrix and define the context of each data point.
Experiments on data classification and clustering with eight real-world datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis testing.
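The summary only hints at the construction, so the sketch below shows one plausible reading of the first step: build a k-nearest-neighbor similarity graph over the data matrix and take each point's neighbors as its context. The helper name, the choice of k, and the use of scikit-learn are illustrative assumptions, not details taken from the Vec2vec paper.

```python
# Hypothetical sketch of the graph-building step described above: construct a
# neighborhood similarity graph over the rows of a data matrix and treat each
# point's neighbors as its "context". An assumed reading of the summary, not
# the actual Vec2vec pipeline.
import numpy as np
from sklearn.neighbors import kneighbors_graph

def neighborhood_contexts(X: np.ndarray, k: int = 10) -> dict[int, list[int]]:
    """Return, for each row index, the indices of its k nearest neighbors."""
    graph = kneighbors_graph(X, n_neighbors=k, mode="connectivity")  # sparse (n, n)
    return {i: list(graph[i].indices) for i in range(X.shape[0])}

# These (point, context) pairs could then be fed to a word2vec-style network
# that learns low-dimensional embeddings preserving local similarity.
X = np.random.default_rng(0).normal(size=(200, 50))
contexts = neighborhood_contexts(X, k=10)
print(contexts[0])
```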
arXiv Detail & Related papers (2021-03-10T23:10:47Z) - A Topological Approach to Inferring the Intrinsic Dimension of Convex Sensing Data [0.0]
We consider a common measurement paradigm, where an unknown subset of an affine space is measured by unknown quasi-filtration functions.
In this paper, we develop a method for inferring the dimension of the data under natural assumptions.
arXiv Detail & Related papers (2020-07-07T05:35:23Z) - Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)
This list is automatically generated from the titles and abstracts of the papers on this site.