Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning
- URL: http://arxiv.org/abs/2002.09046v1
- Date: Thu, 20 Feb 2020 22:28:53 GMT
- Title: Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning
- Authors: Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio
- Abstract summary: We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
- Score: 175.34232468746245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a parameterization method called Neural Bayes which allows
computing statistical quantities that are in general difficult to compute and
opens avenues for formulating new objectives for unsupervised representation
learning. Specifically, given an observed random variable $\mathbf{x}$ and a
latent discrete variable $z$, we can express $p(\mathbf{x}|z)$,
$p(z|\mathbf{x})$ and $p(z)$ in closed form in terms of a sufficiently
expressive function (e.g., a neural network) using our parameterization without
restricting the class of these distributions. To demonstrate its usefulness, we
develop two independent use cases for this parameterization:
1. Mutual Information Maximization (MIM): MIM has become a popular means for
self-supervised representation learning. Neural Bayes allows us to compute
mutual information between observed random variables $\mathbf{x}$ and latent
discrete random variables $z$ in closed form. We use this for learning image
representations and show its usefulness on downstream classification tasks.
2. Disjoint Manifold Labeling: Neural Bayes allows us to formulate an
objective which can optimally label samples from disjoint manifolds present in
the support of a continuous distribution. This can be seen as a specific form
of clustering where each disjoint manifold in the support is a separate
cluster. We design clustering tasks that obey this formulation and empirically
show that the model optimally labels the disjoint manifolds. Our code is
available at \url{https://github.com/salesforce/NeuralBayes}
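As a rough illustration of the MIM use case (item 1 above), the sketch below estimates the closed-form mutual information between inputs and a discrete latent variable from a minibatch of network outputs. It assumes the Neural Bayes parameterization is realized as a softmax head f whose rows are read as p(z|x), with the marginal p(z) estimated by the batch mean of f(x); the function name, the NumPy implementation, and the toy data are illustrative assumptions, not the authors' released code (see the repository linked above for that).
```python
import numpy as np

def neural_bayes_mi(posteriors, eps=1e-8):
    """Minibatch estimate of I(x; z) for a discrete latent z.

    `posteriors` is an (N, K) array whose rows are softmax outputs of a
    network f, read as p(z = k | x_i).  The marginal p(z = k) is estimated
    by the batch mean of f_k(x), and I(x; z) = H(z) - H(z | x).
    """
    p_z = posteriors.mean(axis=0)                      # p(z=k) ~= E_x[f_k(x)]
    h_z = -np.sum(p_z * np.log(p_z + eps))             # marginal entropy H(z)
    h_z_given_x = -np.mean(
        np.sum(posteriors * np.log(posteriors + eps), axis=1)
    )                                                  # conditional entropy H(z|x)
    return h_z - h_z_given_x

# Toy check: near one-hot posteriors spread evenly over K=4 classes
# should give a mutual information close to log(4).
rng = np.random.default_rng(0)
logits = 10.0 * np.eye(4)[rng.integers(0, 4, size=256)] + rng.normal(size=(256, 4))
f = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(neural_bayes_mi(f), np.log(4))
```
In the paper's MIM use case, a quantity of this form is maximized with respect to the parameters of the encoder producing f(x); any regularization used in the authors' training setup is not reproduced in this sketch.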
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups [14.925722398371498]
We introduce a novel discrete diffusion model that simplifies the task of learning a complicated distribution over $S_n$.
Our model achieves state-of-the-art or comparable performances on solving tasks including sorting 4-digit MNIST images.
arXiv Detail & Related papers (2024-10-03T19:37:40Z) - Biology-inspired joint distribution neurons based on Hierarchical Correlation Reconstruction allowing for multidirectional neural networks [0.49728186750345144]
We propose novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction).
The resulting network can also propagate probability distributions (including joint distributions) such as $\rho(y,z|x)$.
arXiv Detail & Related papers (2024-05-08T14:49:27Z) - On counterfactual inference with unobserved confounding [36.18241676876348]
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit.
We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors.
We derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality.
arXiv Detail & Related papers (2022-11-14T04:14:37Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined canonicalization functions.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Neural Implicit Manifold Learning for Topology-Aware Density Estimation [15.878635603835063]
Current generative models learn $\mathcal{M}$ by mapping an $m$-dimensional latent variable through a neural network.
We show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models.
arXiv Detail & Related papers (2022-06-22T18:00:00Z) - Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary constraint $c(\mathbf{x},\mathbf{y})$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions [14.347610075713412]
We show that neural representations can be viewed as LSH-like functions that map each input to an embedding.
An important consequence of this behavior is one-shot learning to unseen classes.
arXiv Detail & Related papers (2021-03-11T18:57:47Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.