For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions
- URL: http://arxiv.org/abs/2103.06875v1
- Date: Thu, 11 Mar 2021 18:57:47 GMT
- Title: For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions
- Authors: Nishanth Dikkala, Gal Kaplun, Rina Panigrahy
- Abstract summary: We show that neural representations can be viewed as LSH-like functions that map each input to an embedding.
An important consequence of this behavior is one-shot learning to unseen classes.
- Score: 14.347610075713412
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: It is well established that training deep neural networks gives useful
representations that capture essential features of the inputs. However, these
representations are poorly understood in theory and practice. In the context of
supervised learning an important question is whether these representations
capture features informative for classification, while filtering out
non-informative noisy ones. We explore a formalization of this question by
considering a generative process where each class is associated with a
high-dimensional manifold and different classes define different manifolds.
Under this model, each input is produced using two latent vectors: (i) a
"manifold identifier" $\gamma$ and (ii) a "transformation parameter" $\theta$
that shifts examples along the surface of a manifold. E.g., $\gamma$ might
represent a canonical image of a dog, and $\theta$ might stand for variations
in pose, background or lighting. We provide theoretical and empirical evidence
that neural representations can be viewed as LSH-like functions that map each
input to an embedding that is a function of solely the informative $\gamma$ and
invariant to $\theta$, effectively recovering the manifold identifier $\gamma$.
An important consequence of this behavior is one-shot learning to unseen
classes.
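The abstract's generative model and its one-shot consequence can be sketched in a few lines. The following toy simulation is an illustrative assumption, not the paper's construction: inputs are produced from a manifold identifier gamma and a nuisance parameter theta, and an idealized LSH-like encoder (here just a projection that discards theta) stands in for the invariant representation the paper argues trained networks learn. Under such an encoder, a single example per class suffices for nearest-prototype classification of an unseen class.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy generative process from the abstract: each input is produced from a
# "manifold identifier" gamma (class-specific, informative) and a
# "transformation parameter" theta (nuisance variation along the manifold).
# Dimensions and the generator below are illustrative assumptions.
D_GAMMA, D_THETA = 8, 4

def generate(gamma, theta):
    # The input mixes the informative gamma with the nuisance theta.
    return np.concatenate([gamma, theta])

# An idealized LSH-like representation: an embedding that is a function of
# gamma alone and invariant to theta (here, a projection that simply drops
# the theta coordinates -- a stand-in for what, per the paper, a trained
# network's representation effectively recovers).
def embed(x):
    return x[:D_GAMMA]

# One-shot learning on an unseen class: one example per class is enough,
# because embeddings of same-class inputs collide regardless of theta.
gammas = {c: rng.normal(size=D_GAMMA) for c in ["dog", "cat", "new_class"]}
prototypes = {c: embed(generate(g, rng.normal(size=D_THETA)))
              for c, g in gammas.items()}

def classify(x):
    # Nearest-prototype rule in embedding space.
    return min(prototypes, key=lambda c: np.linalg.norm(embed(x) - prototypes[c]))

# A fresh example of the unseen class, drawn with a different theta,
# is still assigned to its class.
x_test = generate(gammas["new_class"], rng.normal(size=D_THETA))
print(classify(x_test))  # -> new_class
```

The projection `embed` plays the role of the LSH-like map: inputs that share gamma hash to the same embedding, so a single stored prototype per class identifies new classes at test time.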
Related papers
- The Optimization Landscape of SGD Across the Feature Learning Strength [102.1353410293931]
We study the effect of scaling $\gamma$ across a variety of models and datasets in the online training setting.
We find that optimal online performance is often found at large $\gamma$.
Our findings indicate that analytical study of the large-$\gamma$ limit may yield useful insights into the dynamics of representation learning in performant models.
arXiv Detail & Related papers (2024-10-06T22:30:14Z) - Visualising Feature Learning in Deep Neural Networks by Diagonalizing the Forward Feature Map [4.776836972093627]
We present a method for analysing feature learning by decomposing deep neural networks (DNNs).
We find that DNNs converge to a minimal feature (MF) regime dominated by a number of eigenfunctions equal to the number of classes.
We recast the phenomenon of neural collapse into a kernel picture which can be extended to broader tasks such as regression.
arXiv Detail & Related papers (2024-10-05T18:53:48Z) - GRIL: A $2$-parameter Persistence Based Vectorization for Machine
Learning [0.49703640686206074]
We introduce a novel vector representation called Generalized Rank Invariant Landscape (GRIL) for $2$-parameter persistence modules.
We show that this vector representation is $1$-Lipschitz stable and differentiable with respect to underlying filtration functions.
We also observe an increase in performance, indicating that GRIL can capture additional features enriching Graph Neural Networks (GNNs).
arXiv Detail & Related papers (2023-04-11T04:30:58Z) - Neural Networks can Learn Representations with Gradient Descent [68.95262816363288]
In specific regimes, neural networks trained by gradient descent behave like kernel methods.
In practice, it is known that neural networks strongly outperform their associated kernels.
arXiv Detail & Related papers (2022-06-30T09:24:02Z) - Learning sparse features can lead to overfitting in neural networks [9.2104922520782]
We show that feature learning can perform worse than lazy training.
Although sparsity is known to be essential for learning anisotropic data, it is detrimental when the target function is constant or smooth.
arXiv Detail & Related papers (2022-06-24T14:26:33Z) - Improving Robustness and Generality of NLP Models Using Disentangled
Representations [62.08794500431367]
Supervised neural networks first map an input $x$ to a single representation $z$, and then map $z$ to the output label $y$.
We present methods to improve robustness and generality of NLP models from the standpoint of disentangled representation learning.
We show that models trained with the proposed criteria provide better robustness and domain adaptation ability in a wide range of supervised learning tasks.
arXiv Detail & Related papers (2020-09-21T02:48:46Z) - Towards Understanding Hierarchical Learning: Benefits of Neural
Representations [160.33479656108926]
In this work, we demonstrate that intermediate neural representations add more flexibility to neural networks.
We show that neural representation can achieve improved sample complexities compared with the raw input.
Our results characterize when neural representations are beneficial, and may provide a new perspective on why depth is important in deep learning.
arXiv Detail & Related papers (2020-06-24T02:44:54Z) - Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning [175.34232468746245]
We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
arXiv Detail & Related papers (2020-02-20T22:28:53Z) - Backward Feature Correction: How Deep Learning Performs Deep
(Hierarchical) Learning [66.05472746340142]
This paper analyzes how multi-layer neural networks can perform hierarchical learning _efficiently_ and _automatically_ by SGD on the training objective.
We establish a new principle called "backward feature correction", where the errors in the lower-level features can be automatically corrected when training together with the higher-level layers.
arXiv Detail & Related papers (2020-01-13T17:28:29Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences of its use.