Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning
- URL: http://arxiv.org/abs/2002.09046v1
- Date: Thu, 20 Feb 2020 22:28:53 GMT
- Title: Neural Bayes: A Generic Parameterization Method for Unsupervised
Representation Learning
- Authors: Devansh Arpit, Huan Wang, Caiming Xiong, Richard Socher, Yoshua Bengio
- Abstract summary: We introduce a parameterization method called Neural Bayes.
It allows computing statistical quantities that are in general difficult to compute.
We show two independent use cases for this parameterization.
- Score: 175.34232468746245
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a parameterization method called Neural Bayes which allows
computing statistical quantities that are in general difficult to compute and
opens avenues for formulating new objectives for unsupervised representation
learning. Specifically, given an observed random variable $\mathbf{x}$ and a
latent discrete variable $z$, we can express $p(\mathbf{x}|z)$,
$p(z|\mathbf{x})$ and $p(z)$ in closed form in terms of a sufficiently
expressive function (e.g., a neural network) using our parameterization without
restricting the class of these distributions. To demonstrate its usefulness, we
develop two independent use cases for this parameterization:
1. Mutual Information Maximization (MIM): MIM has become a popular means for
self-supervised representation learning. Neural Bayes allows us to compute
mutual information between observed random variables $\mathbf{x}$ and latent
discrete random variables $z$ in closed form. We use this for learning image
representations and show its usefulness on downstream classification tasks.
2. Disjoint Manifold Labeling: Neural Bayes allows us to formulate an
objective which can optimally label samples from disjoint manifolds present in
the support of a continuous distribution. This can be seen as a specific form
of clustering where each disjoint manifold in the support is a separate
cluster. We design clustering tasks that obey this formulation and empirically
show that the model optimally labels the disjoint manifolds. Our code is
available at \url{https://github.com/salesforce/NeuralBayes}
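As a rough illustration of the MIM use case (item 1 above), the sketch below estimates the closed-form mutual information between inputs and a discrete latent variable from a minibatch of network outputs. It assumes the Neural Bayes parameterization is realized as a softmax head f whose rows are read as p(z|x), with the marginal p(z) estimated by the batch mean of f(x); the function name, the NumPy implementation, and the toy data are illustrative assumptions, not the authors' released code (see the repository linked above for that).
```python
import numpy as np

def neural_bayes_mi(posteriors, eps=1e-8):
    """Minibatch estimate of I(x; z) for a discrete latent z.

    `posteriors` is an (N, K) array whose rows are softmax outputs of a
    network f, read as p(z = k | x_i).  The marginal p(z = k) is estimated
    by the batch mean of f_k(x), and I(x; z) = H(z) - H(z | x).
    """
    p_z = posteriors.mean(axis=0)                      # p(z=k) ~= E_x[f_k(x)]
    h_z = -np.sum(p_z * np.log(p_z + eps))             # marginal entropy H(z)
    h_z_given_x = -np.mean(
        np.sum(posteriors * np.log(posteriors + eps), axis=1)
    )                                                  # conditional entropy H(z|x)
    return h_z - h_z_given_x

# Toy check: near one-hot posteriors spread evenly over K=4 classes
# should give a mutual information close to log(4).
rng = np.random.default_rng(0)
logits = 10.0 * np.eye(4)[rng.integers(0, 4, size=256)] + rng.normal(size=(256, 4))
f = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)
print(neural_bayes_mi(f), np.log(4))
```
In the paper's MIM use case, a quantity of this form is maximized with respect to the parameters of the encoder producing f(x); any regularization used in the authors' training setup is not reproduced in this sketch.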
Related papers
- Dimension-free Private Mean Estimation for Anisotropic Distributions [55.86374912608193]
Previous private estimators on distributions over $\mathbb{R}^d$ suffer from a curse of dimensionality.
We present an algorithm whose sample complexity has improved dependence on dimension.
arXiv Detail & Related papers (2024-11-01T17:59:53Z) - SymmetricDiffusers: Learning Discrete Diffusion on Finite Symmetric Groups [14.925722398371498]
We introduce a novel discrete diffusion model that simplifies the task of learning a complicated distribution over $S_n$.
Our model achieves state-of-the-art or comparable performances on solving tasks including sorting 4-digit MNIST images.
arXiv Detail & Related papers (2024-10-03T19:37:40Z) - Biology-inspired joint distribution neurons based on Hierarchical Correlation Reconstruction allowing for multidirectional neural networks [0.49728186750345144]
We propose novel artificial neurons based on HCR (Hierarchical Correlation Reconstruction).
The resulting network can also propagate probability distributions (including joint distributions) such as $\rho(y,z|x)$.
arXiv Detail & Related papers (2024-05-08T14:49:27Z) - On counterfactual inference with unobserved confounding [36.18241676876348]
Given an observational study with $n$ independent but heterogeneous units, our goal is to learn the counterfactual distribution for each unit.
We introduce a convex objective that pools all $n$ samples to jointly learn all $n$ parameter vectors.
We derive sufficient conditions for compactly supported distributions to satisfy the logarithmic Sobolev inequality.
arXiv Detail & Related papers (2022-11-14T04:14:37Z) - Equivariance with Learned Canonicalization Functions [77.32483958400282]
We show that learning a small neural network to perform canonicalization is better than using predefined canonicalization functions.
Our experiments show that learning the canonicalization function is competitive with existing techniques for learning equivariant functions across many tasks.
arXiv Detail & Related papers (2022-11-11T21:58:15Z) - Neural Implicit Manifold Learning for Topology-Aware Density Estimation [15.878635603835063]
Current generative models learn $\mathcal{M}$ by mapping an $m$-dimensional latent variable through a neural network.
We show that our model can learn manifold-supported distributions with complex topologies more accurately than pushforward models.
arXiv Detail & Related papers (2022-06-22T18:00:00Z) - Diffusion models as plug-and-play priors [98.16404662526101]
We consider the problem of inferring high-dimensional data $\mathbf{x}$ in a model that consists of a prior $p(\mathbf{x})$ and an auxiliary constraint $c(\mathbf{x},\mathbf{y})$.
The structure of diffusion models allows us to perform approximate inference by iterating differentiation through the fixed denoising network enriched with different amounts of noise.
arXiv Detail & Related papers (2022-06-17T21:11:36Z) - The Separation Capacity of Random Neural Networks [78.25060223808936]
We show that a sufficiently large two-layer ReLU-network with standard Gaussian weights and uniformly distributed biases can solve this problem with high probability.
We quantify the relevant structure of the data in terms of a novel notion of mutual complexity.
arXiv Detail & Related papers (2021-07-31T10:25:26Z) - For Manifold Learning, Deep Neural Networks can be Locality Sensitive
Hash Functions [14.347610075713412]
We show that neural representations can be viewed as LSH-like functions that map each input to an embedding.
An important consequence of this behavior is one-shot learning to unseen classes.
arXiv Detail & Related papers (2021-03-11T18:57:47Z) - Agnostic Learning of a Single Neuron with Gradient Descent [92.7662890047311]
We consider the problem of learning the best-fitting single neuron as measured by the expected square loss.
For the ReLU activation, our population risk guarantee is $O(\mathsf{OPT}^{1/2})+\epsilon$.
arXiv Detail & Related papers (2020-05-29T07:20:35Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.