Side-effects of Learning from Low Dimensional Data Embedded in an
Euclidean Space
- URL: http://arxiv.org/abs/2203.00614v1
- Date: Tue, 1 Mar 2022 16:55:51 GMT
- Title: Side-effects of Learning from Low Dimensional Data Embedded in an
Euclidean Space
- Authors: Juncai He, Richard Tsai, Rachel Ward
- Abstract summary: We study the potential regularization effects associated with the network's depth and noise in the codimension of the data manifold.
We also present additional side effects in training due to the presence of noise.
- Score: 3.093890460224435
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: The low dimensional manifold hypothesis posits that the data found in many
applications, such as those involving natural images, lie (approximately) on
low dimensional manifolds embedded in a high dimensional Euclidean space. In
this setting, a typical neural network defines a function that takes a finite
number of vectors in the embedding space as input. However, one often needs to
consider evaluating the optimized network at points outside the training
distribution. This paper considers the case in which the training data is
distributed in a linear subspace of $\mathbb R^d$. We derive estimates on the
variation of the learning function, defined by a neural network, in the
direction transversal to the subspace. We study the potential regularization
effects associated with the network's depth and noise in the codimension of the
data manifold. We also present additional side effects in training due to the
presence of noise.
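A minimal numerical sketch of this setting (assuming PyTorch; the dimensions, network size, and target function below are arbitrary illustrative choices, not the paper's experiments): train a small network on data confined to a 2-dimensional linear subspace of $\mathbb R^{10}$, then probe how the learned function varies along a direction transversal (orthogonal) to that subspace.

```python
# Minimal sketch: data lie in a 2-D linear subspace of R^10; after training,
# evaluate the network off the subspace and record the variation along a
# direction orthogonal to the data subspace. Sizes and the target function
# are illustrative assumptions only.
import torch

torch.manual_seed(0)
d, k, n = 10, 2, 2000                        # ambient dim, subspace dim, samples

# Orthonormal basis for a random k-dimensional subspace of R^d.
basis, _ = torch.linalg.qr(torch.randn(d, k))
z = torch.randn(n, k)                        # latent coordinates on the subspace
x = z @ basis.T                              # training inputs lie in span(basis)
y = torch.sin(z[:, :1]) + 0.5 * z[:, 1:2]    # smooth target defined on the subspace

net = torch.nn.Sequential(
    torch.nn.Linear(d, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 64), torch.nn.ReLU(),
    torch.nn.Linear(64, 1),
)
opt = torch.optim.Adam(net.parameters(), lr=1e-3)
for _ in range(2000):
    opt.zero_grad()
    loss = torch.nn.functional.mse_loss(net(x), y)
    loss.backward()
    opt.step()

# Probe off the data subspace: move a training point along a unit direction
# orthogonal to span(basis) and record how the prediction changes.
v = torch.randn(d)
v -= basis @ (basis.T @ v)                   # remove the in-subspace component
v /= v.norm()
x0 = x[0]
for t in [0.0, 0.5, 1.0, 2.0]:
    print(f"transversal offset {t:.1f}: f = {net(x0 + t * v).item():.4f}")
```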
Related papers
- Asymptotics of Learning with Deep Structured (Random) Features [9.366617422860543]
For a large class of feature maps we provide a tight characterisation of the test error associated with learning the readout layer.
In some cases our results can capture feature maps learned by deep, finite-width neural networks trained under gradient descent.
arXiv Detail & Related papers (2024-02-21T18:35:27Z) - Variation Spaces for Multi-Output Neural Networks: Insights on Multi-Task Learning and Network Compression [28.851519959657466]
This paper introduces a novel theoretical framework for the analysis of vector-valued neural networks.
A key contribution of this work is the development of a representer theorem for the vector-valued variation spaces.
This observation reveals that the norm associated with these vector-valued variation spaces encourages the learning of features that are useful for multiple tasks.
arXiv Detail & Related papers (2023-05-25T23:32:10Z) - Bayesian Interpolation with Deep Linear Networks [92.1721532941863]
Characterizing how neural network depth, width, and dataset size jointly impact model quality is a central problem in deep learning theory.
We show that linear networks make provably optimal predictions at infinite depth.
We also show that with data-agnostic priors, Bayesian model evidence in wide linear networks is maximized at infinite depth.
arXiv Detail & Related papers (2022-12-29T20:57:46Z) - Effects of Data Geometry in Early Deep Learning [16.967930721746672]
Deep neural networks can approximate functions on different types of data, from images to graphs, with varied underlying structure.
We study how a randomly initialized neural network with piecewise-linear activations splits the data manifold into regions where the network behaves as a linear function.
arXiv Detail & Related papers (2022-12-29T17:32:05Z) - Learning Low Dimensional State Spaces with Overparameterized Recurrent
Neural Nets [57.06026574261203]
We provide theoretical evidence for learning low-dimensional state spaces, which can also model long-term memory.
Experiments corroborate our theory, demonstrating extrapolation via learning low-dimensional state spaces with both linear and non-linear RNNs.
arXiv Detail & Related papers (2022-10-25T14:45:15Z) - Intrinsic dimension estimation for discrete metrics [65.5438227932088]
In this letter we introduce an algorithm to infer the intrinsic dimension (ID) of datasets embedded in discrete spaces.
We demonstrate its accuracy on benchmark datasets, and we apply it to analyze a metagenomic dataset for species fingerprinting.
This suggests that evolutionary pressure acts on a low-dimensional manifold despite the high dimensionality of the sequence space.
arXiv Detail & Related papers (2022-07-20T06:38:36Z) - A singular Riemannian geometry approach to Deep Neural Networks II.
Reconstruction of 1-D equivalence classes [78.120734120667]
We construct, in the input space, the preimage of a point on the output manifold.
For simplicity, we focus on neural network maps from n-dimensional real spaces to (n-1)-dimensional real spaces.
arXiv Detail & Related papers (2021-12-17T11:47:45Z) - Exploring the Common Principal Subspace of Deep Features in Neural
Networks [50.37178960258464]
We find that different Deep Neural Networks (DNNs) trained with the same dataset share a common principal subspace in latent spaces.
Specifically, we design a new metric, the $\mathcal{P}$-vector, to represent the principal subspace of deep features learned in a DNN.
Small angles (with cosine close to $1.0$) are found when comparing any two DNNs trained with different algorithms/architectures (a rough numerical sketch of such a comparison appears after this list).
arXiv Detail & Related papers (2021-10-06T15:48:32Z) - Near-Minimax Optimal Estimation With Shallow ReLU Neural Networks [19.216784367141972]
We study the problem of estimating an unknown function from noisy data using shallow (single-hidden layer) ReLU neural networks.
We quantify the performance of these neural network estimators when the data-generating function belongs to the space of functions of second-order bounded variation in the Radon domain.
arXiv Detail & Related papers (2021-09-18T05:56:06Z) - A Local Similarity-Preserving Framework for Nonlinear Dimensionality
Reduction with Neural Networks [56.068488417457935]
We propose a novel local nonlinear approach named Vec2vec for general purpose dimensionality reduction.
To train the neural network, we build the neighborhood similarity graph of a matrix and define the context of data points.
Experiments on data classification and clustering with eight real datasets show that Vec2vec outperforms several classical dimensionality reduction methods under statistical hypothesis tests.
arXiv Detail & Related papers (2021-03-10T23:10:47Z)
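As a rough illustration of the principal-subspace comparison described in "Exploring the Common Principal Subspace of Deep Features in Neural Networks" above, the sketch below compares the leading principal direction of two feature matrices via the cosine of the angle between them. The random stand-in feature matrices and the restriction to a single principal direction are simplifying assumptions for illustration, not the paper's exact $\mathcal{P}$-vector construction.

```python
# Rough sketch (NumPy): compare the leading principal direction of deep-feature
# matrices from two models via the cosine of the angle between them. The random
# matrices stand in for real activations; using only the first principal
# direction is a simplification of the P-vector idea.
import numpy as np

def principal_direction(features: np.ndarray) -> np.ndarray:
    """Top right-singular vector of a centered (samples x dims) feature matrix."""
    centered = features - features.mean(axis=0, keepdims=True)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[0]

rng = np.random.default_rng(0)
shared = rng.normal(size=(1000, 512))                    # stand-in common signal
feats_a = shared + 0.1 * rng.normal(size=(1000, 512))    # "model A" features
feats_b = shared + 0.1 * rng.normal(size=(1000, 512))    # "model B" features

pa, pb = principal_direction(feats_a), principal_direction(feats_b)
cosine = abs(float(pa @ pb))                 # sign of a singular vector is arbitrary
print(f"cosine of angle between principal directions: {cosine:.4f}")
```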
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.