From the Expectation Maximisation Algorithm to Autoencoded Variational
Bayes
- URL: http://arxiv.org/abs/2010.13551v2
- Date: Tue, 4 May 2021 07:33:28 GMT
- Title: From the Expectation Maximisation Algorithm to Autoencoded Variational
Bayes
- Authors: Graham W. Pulford
- Abstract summary: We first give a tutorial presentation of the EM algorithm for estimating the parameters of a $K$-component mixture density.
In a similar style to Bishop's 2009 book, we present variational Bayesian inference as a generalised EM algorithm.
We establish clear links between the EM algorithm and its variational counterparts, hence clarifying the meaning of "latent variables".
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Although the expectation maximisation (EM) algorithm was introduced in 1970,
it remains somewhat inaccessible to machine learning practitioners due to its
obscure notation, terse proofs and lack of concrete links to modern machine
learning techniques like autoencoded variational Bayes. This has resulted in
gaps in the AI literature concerning the meaning of concepts such as "latent
variables" and "variational lower bound," which are frequently used but often
not clearly explained. The roots of these ideas lie in the EM algorithm. We
first give a tutorial presentation of the EM algorithm for estimating the
parameters of a $K$-component mixture density. The Gaussian mixture case is
presented in detail using $K$-ary scalar hidden (or latent) variables rather
than the more traditional binary-valued $K$-dimensional vectors. This
presentation is motivated by mixture modelling from the target tracking
literature. In a similar style to Bishop's 2009 book, we present variational
Bayesian inference as a generalised EM algorithm stemming from the variational
(or evidential) lower bound, as well as the technique of mean field
approximation (or product density transform). We continue the evolution from EM
to variational autoencoders, developed by Kingma & Welling in 2014. In so
doing, we establish clear links between the EM algorithm and its variational
counterparts, hence clarifying the meaning of "latent variables." We provide a
detailed coverage of the "reparametrisation trick" and focus on how the AEVB
differs from conventional variational Bayesian inference. Throughout the
tutorial, consistent notational conventions are used. This unifies the
narrative and clarifies the concepts. Some numerical examples are given to
further illustrate the algorithms.
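As background for the "variational (or evidential) lower bound" mentioned in the abstract, the standard decomposition of the log evidence for an observation $\mathbf{x}$, latent variable $\mathbf{z}$ and parameters $\boldsymbol{\theta}$ is the following (textbook material, not quoted from the paper; the notation here is an assumption):
$$
\ln p(\mathbf{x}\mid\boldsymbol{\theta})
= \underbrace{\mathbb{E}_{q(\mathbf{z})}\!\left[\ln\frac{p(\mathbf{x},\mathbf{z}\mid\boldsymbol{\theta})}{q(\mathbf{z})}\right]}_{\mathcal{L}(q,\,\boldsymbol{\theta})\ \text{(variational lower bound)}}
+ \operatorname{KL}\!\bigl(q(\mathbf{z})\,\big\|\,p(\mathbf{z}\mid\mathbf{x},\boldsymbol{\theta})\bigr).
$$
In EM the E-step sets $q(\mathbf{z}) = p(\mathbf{z}\mid\mathbf{x},\boldsymbol{\theta}^{\text{old}})$, making the bound tight, and the M-step maximises $\mathcal{L}$ over $\boldsymbol{\theta}$; variational Bayes instead restricts $q$ to a tractable (e.g. mean-field factorised) family, which is the generalisation the abstract describes.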
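The "reparametrisation trick" covered in the paper is, in the common Gaussian-latent case, the rewriting of a sample $\mathbf{z}\sim\mathcal{N}(\boldsymbol{\mu},\operatorname{diag}(\boldsymbol{\sigma}^2))$ as $\mathbf{z}=\boldsymbol{\mu}+\boldsymbol{\sigma}\odot\boldsymbol{\epsilon}$ with $\boldsymbol{\epsilon}\sim\mathcal{N}(\mathbf{0},\mathbf{I})$, so that a Monte Carlo estimate of the lower bound is differentiable in the encoder outputs. A minimal PyTorch sketch follows; it is illustrative only (the function names and the unit-variance Gaussian likelihood are assumptions, not the paper's code).

```python
# Illustrative sketch of the reparametrisation trick and a Monte Carlo
# negative-ELBO estimate for a VAE with a diagonal-Gaussian posterior and
# standard-normal prior. Names are assumed; not taken from the paper.
import torch

def reparameterise(mu, log_var):
    """Return z = mu + sigma * eps with eps ~ N(0, I), so z is a
    differentiable function of the encoder outputs (mu, log_var)."""
    eps = torch.randn_like(mu)  # noise drawn outside the computation graph
    return mu + torch.exp(0.5 * log_var) * eps

def negative_elbo(x, x_recon, mu, log_var):
    """Single-sample estimate of -ELBO, averaged over the batch."""
    # Reconstruction term: Gaussian likelihood with unit variance (an assumption).
    recon = 0.5 * ((x - x_recon) ** 2).sum(dim=-1)
    # KL(q(z|x) || N(0, I)) in closed form for diagonal Gaussians.
    kl = -0.5 * (1.0 + log_var - mu.pow(2) - log_var.exp()).sum(dim=-1)
    return (recon + kl).mean()
```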
Related papers
- Batch, match, and patch: low-rank approximations for score-based variational inference [8.840147522046651]
Black-box variational inference scales poorly to high dimensional problems.
We extend the batch-and-match framework for score-based BBVI.
We evaluate this approach on a variety of synthetic target distributions and real-world problems in high-dimensional inference.
arXiv Detail & Related papers (2024-10-29T17:42:56Z)
- Byzantine-Resilient Learning Beyond Gradients: Distributing Evolutionary Search [6.461473289206789]
We show that gradient-free ML algorithms can be combined with classical distributed consensus algorithms to generate gradient-free byzantine-resilient distributed learning algorithms.
We provide proofs and pseudo-code for two specific cases - the Total Order Broadcast and proof-of-work leader election.
arXiv Detail & Related papers (2023-04-20T17:13:29Z)
- Equivalence Between SE(3) Equivariant Networks via Steerable Kernels and Group Convolution [90.67482899242093]
A wide range of techniques have been proposed in recent years for designing neural networks for 3D data that are equivariant under rotation and translation of the input.
We provide an in-depth analysis of both methods and their equivalence and relate the two constructions to multiview convolutional networks.
We also derive new TFN non-linearities from our equivalence principle and test them on practical benchmark datasets.
arXiv Detail & Related papers (2022-11-29T03:42:11Z)
- Learning Shared Kernel Models: the Shared Kernel EM algorithm [0.0]
Expectation maximisation (EM) is an unsupervised learning method for estimating the parameters of a finite mixture distribution.
We first present a rederivation of the standard EM algorithm using data association ideas from the field of multiple target tracking.
The same method is then applied to a little known but much more general type of supervised EM algorithm for shared kernel models.
arXiv Detail & Related papers (2022-05-15T10:10:08Z)
- Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
- Correcting Momentum with Second-order Information [50.992629498861724]
We develop a new algorithm for non-convex stochastic optimization that finds an $\epsilon$-critical point in the optimal $O(\epsilon^{-3})$ stochastic gradient and Hessian-vector product computations.
We validate our results on a variety of large-scale deep learning benchmarks and architectures.
arXiv Detail & Related papers (2021-03-04T19:01:20Z)
- Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of the standard VAE's lack of self-consistency for the learned representations, as well as the consequences of fixing it by introducing a notion of self-consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z)
- Information Theoretic Meta Learning with Gaussian Processes [74.54485310507336]
We formulate meta learning using information theoretic concepts; namely, mutual information and the information bottleneck.
By making use of variational approximations to the mutual information, we derive a general and tractable framework for meta learning.
arXiv Detail & Related papers (2020-09-07T16:47:30Z)
- The FMRIB Variational Bayesian Inference Tutorial II: Stochastic Variational Bayes [1.827510863075184]
This tutorial revisits the original FMRIB Variational Bayes tutorial.
This new approach bears a lot of similarity to, and has benefited from, computational methods applied to machine learning algorithms.
arXiv Detail & Related papers (2020-07-03T11:31:52Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that the Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)