Estimating Dimensionality of Neural Representations from Finite Samples
- URL: http://arxiv.org/abs/2509.26560v1
- Date: Tue, 30 Sep 2025 17:26:22 GMT
- Title: Estimating Dimensionality of Neural Representations from Finite Samples
- Authors: Chanwoo Chun, Abdulkadir Canatar, SueYeon Chung, Daniel Lee
- Abstract summary: We show that the participation ratio of eigenvalues, a popular measure of global dimensionality, is highly biased with small sample sizes. We propose a bias-corrected estimator that is more accurate with finite samples and with noise. We apply our estimator to neural brain recordings, including calcium imaging, electrophysiological recordings, and fMRI data.
- Score: 9.91301674137102
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The global dimensionality of a neural representation manifold provides rich insight into the computational process underlying both artificial and biological neural networks. However, all existing measures of global dimensionality are sensitive to the number of samples, i.e., the number of rows and columns of the sample matrix. We show that, in particular, the participation ratio of eigenvalues, a popular measure of global dimensionality, is highly biased with small sample sizes, and propose a bias-corrected estimator that is more accurate with finite samples and with noise. On synthetic data examples, we demonstrate that our estimator can recover the true known dimensionality. We apply our estimator to neural brain recordings, including calcium imaging, electrophysiological recordings, and fMRI data, and to the neural activations in a large language model and show our estimator is invariant to the sample size. Finally, our estimators can additionally be used to measure the local dimensionalities of curved neural manifolds by weighting the finite samples appropriately.
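For context on the quantity being corrected: the participation ratio of the covariance eigenvalues $\lambda_i$ is $\mathrm{PR} = (\sum_i \lambda_i)^2 / \sum_i \lambda_i^2$, equivalently $\mathrm{tr}(C)^2/\mathrm{tr}(C^2)$ for a covariance matrix $C$. The minimal numpy sketch below illustrates the finite-sample bias the abstract describes; the $1/i$ eigenvalue spectrum and the sample sizes are illustrative assumptions, and the paper's bias-corrected estimator is not reproduced here.

```python
import numpy as np

# Naive participation ratio of a (samples x features) data matrix:
# PR = (sum_i lambda_i)^2 / sum_i lambda_i^2 = tr(C)^2 / tr(C^2),
# with lambda_i the eigenvalues of the sample covariance C.
def participation_ratio(X):
    C = np.cov(X, rowvar=False)          # rows = samples, columns = features
    evals = np.linalg.eigvalsh(C)
    return evals.sum() ** 2 / (evals ** 2).sum()

rng = np.random.default_rng(0)
d = 100
spectrum = 1.0 / np.arange(1, d + 1)     # assumed 1/i eigenvalue decay (illustrative)
true_pr = spectrum.sum() ** 2 / (spectrum ** 2).sum()

# Gaussian samples with diagonal covariance diag(spectrum): the true PR
# is fixed by the spectrum alone, yet the naive estimate drifts with n.
for n in (20, 100, 1000, 10000):
    X = rng.normal(size=(n, d)) * np.sqrt(spectrum)
    print(f"n={n:6d}  naive PR={participation_ratio(X):6.2f}  true PR={true_pr:6.2f}")
```

On this toy spectrum the naive estimate deviates noticeably at small n and only approaches the true value as n grows, which is the finite-sample dependence the paper's bias-corrected estimator is designed to remove.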
Related papers
- Efficient reconstruction of multidimensional random field models with heterogeneous data using stochastic neural networks [0.0]
We prove a generalization error bound for reconstructing multidimensional random field models by training neural networks with a limited amount of training data. Our results indicate that when noise is heterogeneous across dimensions, the convergence rate of the generalization error may not depend explicitly on the model's dimensionality. We show that our Wasserstein-distance approach can successfully train neural networks to learn multidimensional uncertainty models.
arXiv Detail & Related papers (2025-11-17T23:13:07Z)
- Neighborhood Sampling Does Not Learn the Same Graph Neural Network [7.312174450290588]
Neighborhood sampling is an important ingredient in the training of large-scale graph neural networks. We study several established neighborhood sampling approaches and the corresponding posterior GP.
arXiv Detail & Related papers (2025-09-26T19:28:13Z)
- Disentangle Sample Size and Initialization Effect on Perfect Generalization for Single-Neuron Target [2.8948274245812335]
We focus on a single-neuron target recovery scenario in two-layer neural networks.
Our experiments reveal that a smaller scale is associated with improved generalization.
Our results indicate a transition in the model's ability to recover the target function.
arXiv Detail & Related papers (2024-05-22T16:12:28Z)
- Neural network enhanced measurement efficiency for molecular groundstates [63.36515347329037]
We adapt common neural network models to learn complex groundstate wavefunctions for several molecular qubit Hamiltonians.
We find that using a neural network model provides a robust improvement over using single-copy measurement outcomes alone to reconstruct observables.
arXiv Detail & Related papers (2022-06-30T17:45:05Z)
- Efficient quantum state tomography with convolutional neural networks [0.0]
We develop a quantum state tomography scheme which relies on approximating the probability distribution over the outcomes of an informationally complete measurement.
It achieves a reduction of the estimation error of observables by up to an order of magnitude compared to their direct estimation from experimental data.
arXiv Detail & Related papers (2021-09-28T14:55:54Z)
- Efficient Multidimensional Functional Data Analysis Using Marginal Product Basis Systems [2.4554686192257424]
We propose a framework for learning continuous representations from a sample of multidimensional functional data.
We show that the resulting estimation problem can be solved efficiently by tensor decomposition.
We conclude with a real data application in neuroimaging.
arXiv Detail & Related papers (2021-07-30T16:02:15Z)
- Pure Exploration in Kernel and Neural Bandits [90.23165420559664]
We study pure exploration in bandits, where the dimension of the feature representation can be much larger than the number of arms.
To overcome the curse of dimensionality, we propose to adaptively embed the feature representation of each arm into a lower-dimensional space.
arXiv Detail & Related papers (2021-06-22T19:51:59Z)
- Intrinsic Dimension Estimation [92.87600241234344]
We introduce a new estimator of the intrinsic dimension and provide finite sample, non-asymptotic guarantees.
We then apply our techniques to get new sample complexity bounds for Generative Adversarial Networks (GANs) depending on the intrinsic dimension of the data.
arXiv Detail & Related papers (2021-06-08T00:05:39Z)
- Estimating informativeness of samples with Smooth Unique Information [108.25192785062367]
We measure how much a sample informs the final weights and how much it informs the function computed by the weights.
We give efficient approximations of these quantities using a linearized network.
We apply these measures to several problems, such as dataset summarization.
arXiv Detail & Related papers (2021-01-17T10:29:29Z)
- Intrinsic Dimensionality Explains the Effectiveness of Language Model Fine-Tuning [52.624194343095304]
We argue that analyzing fine-tuning through the lens of intrinsic dimension provides us with empirical and theoretical intuitions.
We empirically show that common pre-trained models have a very low intrinsic dimension.
arXiv Detail & Related papers (2020-12-22T07:42:30Z)
- Deep Dimension Reduction for Supervised Representation Learning [51.10448064423656]
We propose a deep dimension reduction approach to learning representations with essential characteristics.
The proposed approach is a nonparametric generalization of the sufficient dimension reduction method.
We show that the estimated deep nonparametric representation is consistent in the sense that its excess risk converges to zero.
arXiv Detail & Related papers (2020-06-10T14:47:43Z)
This list is automatically generated from the titles and abstracts of the papers in this site.