On consistent estimation of dimension values
- URL: http://arxiv.org/abs/2412.13898v2
- Date: Fri, 04 Jul 2025 12:13:00 GMT
- Title: On consistent estimation of dimension values
- Authors: Alejandro Cholaquidis, Antonio Cuevas, Beatriz Pateiro-López
- Abstract summary: The problem of estimating, from a random sample of points, the dimension of a compact subset $S$ of the Euclidean space is considered. We focus on three notions: the Minkowski dimension, the correlation dimension and the concept of pointwise dimension. In particular, we explore the case in which the true volume function $V(r)$ of the target set $S$ is a polynomial on some interval starting at zero.
- Score: 45.52331418900137
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The problem of estimating, from a random sample of points, the dimension of a compact subset $S$ of the Euclidean space is considered. The emphasis is put on consistency results in the statistical sense. That is, statements of convergence to the true dimension value when the sample size grows to infinity. Among the many available definitions of dimension, we have focused (on the grounds of its statistical tractability) on three notions: the Minkowski dimension, the correlation dimension and the, perhaps less popular, concept of pointwise dimension. We prove the statistical consistency of some natural estimators of these quantities. Our proofs partially rely on the use of an instrumental estimator formulated in terms of the empirical volume function $V_n(r)$, defined as the Lebesgue measure of the set of points whose distance to the sample is at most $r$. In particular, we explore the case in which the true volume function $V(r)$ of the target set $S$ is a polynomial on some interval starting at zero. An empirical study is also included. Our study aims to provide some theoretical support, and some practical insights, for the problem of deciding whether or not the set $S$ has a dimension smaller than that of the ambient space. This is a major statistical motivation of the dimension studies, in connection with the so-called ``Manifold Hypothesis''.
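The role of the empirical volume function can be illustrated with a small numerical sketch. It is not the authors' estimator: it assumes the standard heuristic that, for a set of Minkowski dimension $d$ in $\mathbb{R}^D$, $V(r)$ behaves like a constant times $r^{D-d}$ for small $r$, so that $d$ can be read off as $D$ minus the slope of $\log V_n(r)$ against $\log r$. The Monte Carlo approximation of $V_n(r)$, the function names, the choice of radii and the test set (a unit circle in the plane) are all illustrative assumptions.

```python
import numpy as np


def min_dist_to_sample(queries, sample, chunk=2000):
    """Distance from each query point to its nearest sample point (brute force)."""
    out = np.empty(len(queries))
    for start in range(0, len(queries), chunk):
        q = queries[start:start + chunk]
        # Pairwise squared distances between this chunk of queries and the sample.
        d2 = ((q[:, None, :] - sample[None, :, :]) ** 2).sum(axis=2)
        out[start:start + chunk] = np.sqrt(d2.min(axis=1))
    return out


def empirical_volume(sample, radii, n_mc=40000, rng=None):
    """Monte Carlo approximation of V_n(r): the Lebesgue measure of the set of
    points whose distance to the sample is at most r, for each r in radii."""
    rng = np.random.default_rng() if rng is None else rng
    r_max = float(np.max(radii))
    # Bounding box padded by r_max, so it contains every r-neighbourhood considered.
    lo = sample.min(axis=0) - r_max
    hi = sample.max(axis=0) + r_max
    queries = rng.uniform(lo, hi, size=(n_mc, sample.shape[1]))
    dist = min_dist_to_sample(queries, sample)
    box_volume = float(np.prod(hi - lo))
    # Fraction of uniform points within distance r of the sample, times box volume.
    return np.array([box_volume * np.mean(dist <= r) for r in radii])


def minkowski_dimension_estimate(sample, radii, **kwargs):
    """Ambient dimension minus the least-squares slope of log V_n(r) on log r."""
    volumes = empirical_volume(sample, radii, **kwargs)
    slope, _ = np.polyfit(np.log(radii), np.log(volumes), 1)
    return sample.shape[1] - slope


# Illustrative data: 1000 points on the unit circle in R^2 (true dimension 1).
rng = np.random.default_rng(0)
theta = rng.uniform(0.0, 2.0 * np.pi, size=1000)
circle = np.column_stack([np.cos(theta), np.sin(theta)])
radii = np.geomspace(0.05, 0.2, 8)

d_hat = minkowski_dimension_estimate(circle, radii, rng=rng)
print(f"estimated dimension: {d_hat:.2f}")
```

For the unit circle the true volume function is $V(r) = 4\pi r$ for $r < 1$, which is a polynomial on an interval starting at zero, so this is an instance of the polynomial-volume case singled out in the abstract. The radii must be large relative to the typical gap between sample points, otherwise $V_n(r)$ mostly reflects the individual balls around sample points rather than the geometry of $S$.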
Related papers
- Spherical dimension [15.07787640047213]
Spherical dimension is a natural relaxation of the VC dimension.
Spherical dimension serves as a common foundation for leveraging the Borsuk-Ulam theorem and related topological tools.
arXiv Detail & Related papers (2025-03-13T10:32:25Z) - A Statistical Analysis for Supervised Deep Learning with Exponential Families for Intrinsically Low-dimensional Data [32.98264375121064]
We consider supervised deep learning when the given explanatory variable is distributed according to an exponential family. Under the assumption of an upper-bounded density of the explanatory variables, we characterize the rate of convergence as $\tilde{\mathcal{O}}\left( d^{\frac{2\lfloor\beta\rfloor(\beta + d)}{2\beta + d}} n^{-\frac{2}{2\beta + d}} \right)$.
arXiv Detail & Related papers (2024-12-13T01:15:17Z) - Blessing of Dimensionality for Approximating Sobolev Classes on Manifolds [14.183849746284816]
We consider optimal uniform approximations with functions of finite statistical complexity. In particular, we demonstrate that the statistical complexity required to approximate a class of bounded Sobolev functions on a compact manifold is bounded from below.
arXiv Detail & Related papers (2024-08-13T15:56:42Z) - Sparse PCA with Oracle Property [115.72363972222622]
We propose a family of estimators based on the semidefinite relaxation of sparse PCA with novel regularizations.
We prove that another estimator within the family achieves a sharper statistical rate of convergence than the standard semidefinite relaxation of sparse PCA.
arXiv Detail & Related papers (2023-12-28T02:52:54Z) - The signaling dimension in generalized probabilistic theories [48.99818550820575]
The signaling dimension of a given physical system quantifies the minimum dimension of a classical system required to reproduce all input/output correlations of the given system.
We show that it suffices to consider extremal measurements with ray-extremal effects, and we bound the number of elements of any such measurement in terms of the linear dimension.
For systems with a finite number of extremal effects, we recast the problem of characterizing the extremal measurements with ray-extremal effects.
arXiv Detail & Related papers (2023-11-22T02:09:16Z) - Effective Minkowski Dimension of Deep Nonparametric Regression: Function Approximation and Statistical Theories [70.90012822736988]
Existing theories on deep nonparametric regression have shown that when the input data lie on a low-dimensional manifold, deep neural networks can adapt to intrinsic data structures.
This paper introduces a relaxed assumption that input data are concentrated around a subset of $\mathbb{R}^d$ denoted by $\mathcal{S}$, and the intrinsic dimension of $\mathcal{S}$ can be characterized by a new complexity notion -- effective Minkowski dimension.
arXiv Detail & Related papers (2023-06-26T17:13:31Z) - Evolution of many-body systems under ancilla quantum measurements [58.720142291102135]
We study the concept of implementing quantum measurements by coupling a many-body lattice system to an ancillary degree of freedom.
We find evidence of a disentangling-entangling measurement-induced transition as was previously observed in more abstract models.
arXiv Detail & Related papers (2023-03-13T13:06:40Z) - Intrinsic Dimensionality Estimation within Tight Localities: A Theoretical and Experimental Analysis [0.0]
We propose a local ID estimation strategy that is stable even for 'tight' localities consisting of as few as 20 sample points.
Our experimental results show that our proposed estimation technique can achieve notably smaller variance, while maintaining comparable levels of bias, at much smaller sample sizes than state-of-the-art estimators.
arXiv Detail & Related papers (2022-09-29T00:00:11Z) - Tangent Space and Dimension Estimation with the Wasserstein Distance [10.118241139691952]
Consider a set of points sampled independently near a smooth compact submanifold of Euclidean space.
We provide mathematically rigorous bounds on the number of sample points required to estimate both the dimension and the tangent spaces of that manifold.
arXiv Detail & Related papers (2021-10-12T21:02:06Z) - Limit Distribution Theory for the Smooth 1-Wasserstein Distance with Applications [18.618590805279187]
The smooth 1-Wasserstein distance (SWD) $W_1^\sigma$ was recently proposed as a means to mitigate the curse of dimensionality in empirical approximation.
This work conducts a thorough statistical study of the SWD, including a high-dimensional limit distribution result.
arXiv Detail & Related papers (2021-07-28T17:02:24Z) - Manifold Hypothesis in Data Analysis: Double Geometrically-Probabilistic Approach to Manifold Dimension Estimation [92.81218653234669]
We present a new approach to manifold hypothesis checking and underlying manifold dimension estimation.
Our geometrical method is a modification for sparse data of a well-known box-counting algorithm for Minkowski dimension calculation.
Experiments on real datasets show that the suggested approach, based on a combination of the two methods, is powerful and effective.
arXiv Detail & Related papers (2021-07-08T15:35:54Z) - Intrinsic Dimension Estimation [92.87600241234344]
We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
arXiv Detail & Related papers (2021-06-08T00:05:39Z) - A Class of Dimension-free Metrics for the Convergence of Empirical Measures [6.253771639590562]
We show that under the proposed metrics, the convergence of empirical measures in high dimensions is free of the curse of dimensionality (CoD).
Examples of selected test function spaces include the reproducing kernel Hilbert spaces, Barron space, and flow-induced function spaces.
We show that the proposed class of metrics is a powerful tool to analyze the convergence of empirical measures in high dimensions without CoD.
arXiv Detail & Related papers (2021-04-24T23:27:40Z) - Concentration estimates for random subspaces of a tensor product, and application to Quantum Information Theory [0.0]
Given a random subspace $H_n$ chosen uniformly in a tensor product of Hilbert spaces $V_n \otimes W$, we consider the collection $K_n$ of all singular values of all norm one elements of $H_n$.
A law of large numbers has been obtained for this random set in the context of $W$ fixed and the dimension of $H_n$ and $V_n$ tending to infinity at the same speed.
arXiv Detail & Related papers (2020-11-30T23:22:55Z) - A Topological Approach to Inferring the Intrinsic Dimension of Convex Sensing Data [0.0]
We consider a common measurement paradigm, where an unknown subset of an affine space is measured by unknown quasi-filtration functions.
In this paper, we develop a method for inferring the dimension of the data under natural assumptions.
arXiv Detail & Related papers (2020-07-07T05:35:23Z) - Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridge-less least squares.
We consider the common case of estimators defined by scale dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z) - Geometry of Similarity Comparisons [51.552779977889045]
We show that the ordinal capacity of a space form is related to its dimension and the sign of its curvature.
More importantly, we show that the statistical behavior of the ordinal spread random variables defined on a similarity graph can be used to identify its underlying space form.
arXiv Detail & Related papers (2020-06-17T13:37:42Z) - A Concentration of Measure and Random Matrix Approach to Large Dimensional Robust Statistics [45.24358490877106]
This article studies the robust covariance matrix estimation of a data collection $X = (x_1,\ldots,x_n)$ with $x_i = \sqrt{\tau_i} z_i + m$.
We exploit this semi-metric along with concentration of measure arguments to prove the existence and uniqueness of the robust estimator as well as evaluate its limiting spectral distribution.
arXiv Detail & Related papers (2020-06-17T09:02:26Z)