For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions
- URL: http://arxiv.org/abs/2103.06875v1
- Date: Thu, 11 Mar 2021 18:57:47 GMT
- Title: For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions
- Authors: Nishanth Dikkala, Gal Kaplun, Rina Panigrahy
- Abstract summary: We show that neural representations can be viewed as LSH-like functions that map each input to an embedding.
An important consequence of this behavior is one-shot learning to unseen classes.
- Score: 14.347610075713412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well established that training deep neural networks gives useful
representations that capture essential features of the inputs. However, these
representations are poorly understood in theory and practice. In the context of
supervised learning an important question is whether these representations
capture features informative for classification, while filtering out
non-informative noisy ones. We explore a formalization of this question by
considering a generative process where each class is associated with a
high-dimensional manifold and different classes define different manifolds.
Under this model, each input is produced using two latent vectors: (i) a
"manifold identifier" $\gamma$ and (ii) a "transformation parameter" $\theta$
that shifts examples along the surface of a manifold. E.g., $\gamma$ might
represent a canonical image of a dog, and $\theta$ might stand for variations
in pose, background or lighting. We provide theoretical and empirical evidence
that neural representations can be viewed as LSH-like functions that map each
input to an embedding that is a function of solely the informative $\gamma$ and
invariant to $\theta$, effectively recovering the manifold identifier $\gamma$.
An important consequence of this behavior is one-shot learning to unseen
classes.
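The abstract's generative model and its one-shot consequence can be sketched in a few lines. The following toy simulation is an illustrative assumption, not the paper's construction: inputs are produced from a manifold identifier gamma and a nuisance parameter theta, and an idealized LSH-like encoder (here just a projection that discards theta) stands in for the invariant representation the paper argues trained networks learn. Under such an encoder, a single example per class suffices for nearest-prototype classification of an unseen class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative process from the abstract: each input is produced from a
# "manifold identifier" gamma (class-specific, informative) and a
# "transformation parameter" theta (nuisance variation along the manifold).
# Dimensions and the generator below are illustrative assumptions.
D_GAMMA, D_THETA = 8, 4

def generate(gamma, theta):
    # The input mixes the informative gamma with the nuisance theta.
    return np.concatenate([gamma, theta])

# An idealized LSH-like representation: an embedding that is a function of
# gamma alone and invariant to theta (here, a projection that simply drops
# the theta coordinates -- a stand-in for what, per the paper, a trained
# network's representation effectively recovers).
def embed(x):
    return x[:D_GAMMA]

# One-shot learning on an unseen class: one example per class is enough,
# because embeddings of same-class inputs collide regardless of theta.
gammas = {c: rng.normal(size=D_GAMMA) for c in ["dog", "cat", "new_class"]}
prototypes = {c: embed(generate(g, rng.normal(size=D_THETA)))
              for c, g in gammas.items()}

def classify(x):
    # Nearest-prototype rule in embedding space.
    return min(prototypes, key=lambda c: np.linalg.norm(embed(x) - prototypes[c]))

# A fresh example of the unseen class, drawn with a different theta,
# is still assigned to its class.
x_test = generate(gammas["new_class"], rng.normal(size=D_THETA))
print(classify(x_test))  # -> new_class
```

The projection `embed` plays the role of the LSH-like map: inputs that share gamma hash to the same embedding, so a single stored prototype per class identifies new classes at test time.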
Related papers
- The Optimization Landscape of SGD Across the Feature Learning Strength [102.1353410293931]
We study the effect of scaling $\gamma$ across a variety of models and datasets in the online training setting.
We find that optimal online performance is often found at large $\gamma$.
Our findings indicate that analytical study of the large-$\gamma$ limit may yield useful insights into the dynamics of representation learning in performant models.
arXiv Detail & Related papers (2024-10-06T22:30:14Z) - Visualising Feature Learning in Deep Neural Networks by Diagonalizing the Forward Feature Map [4.776836972093627]
We present a method for analysing feature learning by decomposing deep neural networks (DNNs).
We find that DNNs converge to a minimal feature (MF) regime dominated by a number of eigenfunctions equal to the number of classes.
We recast the phenomenon of neural collapse into a kernel picture which can be extended to broader tasks such as regression.
arXiv Detail & Related papers (2024-10-05T18:53:48Z) - GRIL: A $2$-parameter Persistence Based Vectorization for Machine
Learning [0.49703640686206074]
We introduce a novel vector representation called Generalized Rank Invariant Landscape (GRIL) for $2$-parameter persistence modules.
We show that this vector representation is $1$-Lipschitz stable and differentiable with respect to underlying filtration functions.
We also observe an increase in performance, indicating that GRIL can capture additional features enriching Graph Neural Networks (GNNs).
arXiv Detail & Related papers (2023-04-11T04:30:58Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - Learning sparse features can lead to overfitting in neural networks [9.2104922520782]
We show that feature learning can perform worse than lazy training.
Although sparsity is known to be essential for learning anisotropic data, it is detrimental when the target function is constant or smooth.
arXiv Detail & Related papers (2022-06-24T14:26:33Z) - Improving Robustness and Generality of NLP Models Using Disentangled
Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z) - Towards Understanding Hierarchical Learning: Benefits of Neural
Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z) - Backward Feature Correction: How Deep Learning Performs Deep
(Hierarchical) Learning [66.05472746340142]
This paper analyzes how multi-layer neural networks can perform hierarchical learning _efficiently_ and _automatically_ by SGD on the training objective.
We establish a new principle called "backward feature correction", where the errors in the lower-level features can be automatically corrected when training together with the higher-level layers.
arXiv Detail & Related papers (2020-01-13T17:28:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.