Cauchy-Schwarz Regularized Autoencoder
- URL: http://arxiv.org/abs/2101.02149v2
- Date: Fri, 12 Feb 2021 18:47:39 GMT
- Title: Cauchy-Schwarz Regularized Autoencoder
- Authors: Linh Tran, Maja Pantic, Marc Peter Deisenroth
- Abstract summary: Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
- Score: 68.80569889599434
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Recent work in unsupervised learning has focused on efficient inference and
learning in latent variable models. Training these models by maximizing the
evidence (marginal likelihood) is typically intractable. Thus, a common
approximation is to maximize the Evidence Lower BOund (ELBO) instead.
Variational autoencoders (VAE) are a powerful and widely-used class of
generative models that optimize the ELBO efficiently for large datasets.
However, the VAE's default Gaussian choice for the prior imposes a strong
constraint on its ability to represent the true posterior, thereby degrading
overall performance. A Gaussian mixture model (GMM) would be a richer prior,
but cannot be handled efficiently within the VAE framework because of the
intractability of the Kullback-Leibler divergence for GMMs. We deviate from the
common VAE framework in favor of one with an analytical solution for a Gaussian
mixture prior. To perform efficient inference for GMM priors, we introduce a
new constrained objective based on the Cauchy-Schwarz divergence, which can be
computed analytically for GMMs. This new objective allows us to incorporate
richer, multi-modal priors into the autoencoding framework. We provide
empirical studies on a range of datasets and show that our objective improves
upon variational auto-encoding models in density estimation, unsupervised
clustering, semi-supervised learning, and face analysis.
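The key computational ingredient is the Cauchy-Schwarz divergence, $D_{CS}(p, q) = -\log\big(\int p\,q\,dx \,/\, \sqrt{\int p^2 dx \int q^2 dx}\big)$, which, unlike the KL divergence, admits a closed form when $p$ and $q$ are Gaussian mixtures: every pairwise product of Gaussian densities integrates to a Gaussian evaluated at a point, $\int \mathcal{N}(x; m_1, S_1)\,\mathcal{N}(x; m_2, S_2)\,dx = \mathcal{N}(m_1; m_2, S_1 + S_2)$. The snippet below is a minimal NumPy/SciPy sketch of that closed form; the function names, the toy mixtures, and the implementation details are illustrative assumptions, not the authors' released code.

```python
# Minimal sketch (assumption, not the paper's reference implementation) of the
# closed-form Cauchy-Schwarz divergence between two Gaussian mixture models:
#   D_CS(p, q) = -log( int p q dx / sqrt( int p^2 dx * int q^2 dx ) ),
# using int N(x; m1, S1) N(x; m2, S2) dx = N(m1; m2, S1 + S2).
import numpy as np
from scipy.stats import multivariate_normal


def _cross_term(weights1, means1, covs1, weights2, means2, covs2):
    """Compute int p(x) q(x) dx for two GMMs as a double sum of Gaussian overlaps."""
    total = 0.0
    for w1, m1, c1 in zip(weights1, means1, covs1):
        for w2, m2, c2 in zip(weights2, means2, covs2):
            total += w1 * w2 * multivariate_normal.pdf(m1, mean=m2, cov=c1 + c2)
    return total


def cauchy_schwarz_divergence(p, q):
    """p and q are (weights, means, covs) tuples describing GMMs."""
    pq = _cross_term(*p, *q)   # int p q dx
    pp = _cross_term(*p, *p)   # int p^2 dx
    qq = _cross_term(*q, *q)   # int q^2 dx
    return -np.log(pq / np.sqrt(pp * qq))


# Toy example: two 2-component mixtures in 2-D (hypothetical values).
p = ([0.5, 0.5], [np.zeros(2), np.ones(2)], [np.eye(2), np.eye(2)])
q = ([0.3, 0.7], [np.ones(2), -np.ones(2)], [np.eye(2), 2 * np.eye(2)])
# Strictly positive here since p != q; the divergence is 0 iff p == q a.e.
print(cauchy_schwarz_divergence(p, q))
```

Because each of the three integrals reduces to a finite double sum over mixture components, the divergence can be evaluated and differentiated exactly, which is what lets a multi-modal GMM prior enter the autoencoding objective analytically rather than through the intractable KL term.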
Related papers
- Optimizing Attention with Mirror Descent: Generalized Max-Margin Token Selection [6.759148939470332]
We show that these algorithms converge in direction to a generalized hard-margin SVM with an $\ell_p$-norm objective.
arXiv Detail & Related papers (2024-10-18T16:32:06Z) - Model Inversion Attacks Through Target-Specific Conditional Diffusion Models [54.69008212790426]
Model inversion attacks (MIAs) aim to reconstruct private images from a target classifier's training set, thereby raising privacy concerns in AI applications.
Previous GAN-based MIAs tend to suffer from inferior generative fidelity due to GAN's inherent flaws and biased optimization within latent space.
We propose Diffusion-based Model Inversion (Diff-MI) attacks to alleviate these issues.
arXiv Detail & Related papers (2024-07-16T06:38:49Z) - ClusterDDPM: An EM clustering framework with Denoising Diffusion
Probabilistic Models [9.91610928326645]
Denoising diffusion probabilistic models (DDPMs) represent a new and promising class of generative models.
In this study, we introduce an innovative expectation-maximization (EM) framework for clustering using DDPMs.
In the M-step, our focus lies in learning clustering-friendly latent representations for the data by employing the conditional DDPM and matching the distribution of latent representations to the mixture of Gaussian priors.
arXiv Detail & Related papers (2023-12-13T10:04:06Z) - An Efficient 1 Iteration Learning Algorithm for Gaussian Mixture Model
And Gaussian Mixture Embedding For Neural Network [2.261786383673667]
The new algorithm brings more robustness and simplicity than the classic Expectation Maximization (EM) algorithm.
It also improves accuracy and takes only one iteration for learning.
arXiv Detail & Related papers (2023-08-18T10:17:59Z) - Personalized Federated Learning under Mixture of Distributions [98.25444470990107]
We propose FedGMM, a novel approach to Personalized Federated Learning (PFL) that utilizes Gaussian mixture models (GMM) to fit the input data distributions across diverse clients.
FedGMM possesses an additional advantage of adapting to new clients with minimal overhead, and it also enables uncertainty quantification.
Empirical evaluations on synthetic and benchmark datasets demonstrate the superior performance of our method in both PFL classification and novel sample detection.
arXiv Detail & Related papers (2023-05-01T20:04:46Z) - A distribution-free mixed-integer optimization approach to hierarchical modelling of clustered and longitudinal data [0.0]
We introduce an innovative algorithm that evaluates cluster effects for new data points, thereby increasing the robustness and precision of the hierarchical model.
The inferential and predictive efficacy of this approach is further illustrated through its application in student scoring and protein expression.
arXiv Detail & Related papers (2023-02-06T23:34:51Z) - On the failure of variational score matching for VAE models [3.8073142980733]
We present a critical study of existing variational SM objectives, showing catastrophic failure on a wide range of datasets and network architectures.
Our theoretical insights on the objectives emerge directly from their equivalent autoencoding losses when optimizing variational autoencoder (VAE) models.
arXiv Detail & Related papers (2022-10-24T16:43:04Z) - Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach, Gaussian Mixture Replay (GMR), which uses a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z) - Understanding Overparameterization in Generative Adversarial Networks [56.57403335510056]
Generative Adversarial Networks (GANs) are trained by solving non-convex concave min-max optimization problems.
A growing body of theory has shown the importance of overparameterization for the convergence of gradient descent (GD) to globally optimal solutions.
We show that in an overparameterized GAN with a $1$-layer neural network generator and a linear discriminator, GDA converges to a global saddle point of the underlying non-convex concave min-max problem.
arXiv Detail & Related papers (2021-04-12T16:23:37Z) - Generalized Matrix Factorization: efficient algorithms for fitting
generalized linear latent variable models to large data arrays [62.997667081978825]
Generalized Linear Latent Variable models (GLLVMs) generalize classical factor models to non-Gaussian responses.
Current algorithms for estimating model parameters in GLLVMs require intensive computation and do not scale to large datasets.
We propose a new approach for fitting GLLVMs to high-dimensional datasets, based on approximating the model using penalized quasi-likelihood.
arXiv Detail & Related papers (2020-10-06T04:28:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.