Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning
- URL: http://arxiv.org/abs/2505.18558v1
- Date: Sat, 24 May 2025 06:52:23 GMT
- Title: Joint-stochastic-approximation Autoencoders with Application to Semi-supervised Learning
- Authors: Wenbo He, Zhijian Ou,
- Abstract summary: We present Joint-stochastic-approximation (JSA) autoencoders - a new family of algorithms for building deep directed generative models. The JSA learning algorithm directly maximizes the data log-likelihood and simultaneously minimizes the inclusive KL divergence between the posterior and the inference model. We empirically show that JSA autoencoders with discrete latent space achieve comparable performance to other state-of-the-art DGMs with continuous latent space in semi-supervised tasks.
- Score: 16.625057220045292
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Our examination of existing deep generative models (DGMs), including VAEs and GANs, reveals two problems. First, their capability in handling discrete observations and latent codes is unsatisfactory, despite interesting efforts. Second, both VAEs and GANs optimize criteria that are only indirectly related to the data likelihood. To address these problems, we formally present Joint-stochastic-approximation (JSA) autoencoders - a new family of algorithms for building deep directed generative models, with application to semi-supervised learning. The JSA learning algorithm directly maximizes the data log-likelihood and simultaneously minimizes the inclusive KL divergence between the posterior and the inference model. We provide theoretical results and conduct a series of experiments to show its advantages, such as robustness to structure mismatch between encoder and decoder and consistent handling of both discrete and continuous variables. In particular, we empirically show that JSA autoencoders with discrete latent space achieve performance comparable to other state-of-the-art DGMs with continuous latent space on semi-supervised tasks over the widely adopted MNIST and SVHN datasets. To the best of our knowledge, this is the first demonstration that discrete latent variable models can be successfully applied to challenging semi-supervised tasks.
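To make the training loop concrete, below is a minimal, illustrative PyTorch sketch of one JSA-style training step with Bernoulli latent codes: the inference model proposes latent codes, a Metropolis independence sampler corrects them toward the true posterior, and the accepted samples drive a log-likelihood update for the decoder and an inclusive-KL update for the encoder. All module names, layer sizes, and the single-layer networks are assumptions for illustration, not the authors' reference implementation.

```python
# Sketch of a JSA-style training step with Bernoulli latents (illustrative only).
import torch
import torch.nn as nn

D, H = 784, 64                                   # observation and latent sizes (assumed)
encoder = nn.Sequential(nn.Linear(D, H))         # q_phi(h|x): Bernoulli logits
decoder = nn.Sequential(nn.Linear(H, D))         # p_theta(x|h): Bernoulli logits
prior_logits = torch.zeros(H)                    # p(h): factorized Bernoulli(0.5)
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

def log_bernoulli(logits, target):
    # Sum of Bernoulli log-probabilities over the last dimension.
    return -nn.functional.binary_cross_entropy_with_logits(
        logits, target, reduction="none").sum(-1)

def log_joint(x, h):
    # log p_theta(x, h) = log p(h) + log p_theta(x|h)
    return log_bernoulli(prior_logits.expand_as(h), h) + log_bernoulli(decoder(h), x)

def log_q(x, h):
    # log q_phi(h|x)
    return log_bernoulli(encoder(x), h)

def jsa_step(x, h_cache):
    # 1) Metropolis independence sampler: propose h' ~ q_phi(.|x) and accept with
    #    probability min(1, w(h')/w(h)), where w(h) = p_theta(x,h) / q_phi(h|x).
    with torch.no_grad():
        h_prop = torch.bernoulli(torch.sigmoid(encoder(x)))
        log_w_prop = log_joint(x, h_prop) - log_q(x, h_prop)
        log_w_old = log_joint(x, h_cache) - log_q(x, h_cache)
        accept = (torch.rand(x.shape[0]).log() < (log_w_prop - log_w_old)).float().unsqueeze(-1)
        h = accept * h_prop + (1 - accept) * h_cache
    # 2) Stochastic-approximation updates on the accepted sample: theta ascends
    #    log p_theta(x, h) (data log-likelihood), phi ascends log q_phi(h|x)
    #    (i.e. descends the inclusive KL between the posterior and the inference model).
    loss = -(log_joint(x, h) + log_q(x, h)).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    return h  # cached latent state for the next sweep of the Markov chain

# Usage sketch:
# x = torch.bernoulli(torch.rand(32, D)); h0 = torch.bernoulli(torch.full((32, H), 0.5))
# h0 = jsa_step(x, h0)
```

In practice one cached latent chain is kept per training example, so the sampler warms up across epochs rather than restarting every step.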
Related papers
- Joint-stochastic-approximation Random Fields with Application to Semi-supervised Learning [10.293017518216908]
We present joint-stochastic-approximation random fields (JRFs) -- a new family of algorithms for building deep undirected generative models. It is found through synthetic experiments that JRFs work well in balancing mode covering and mode missing, and match the empirical data distribution well.
arXiv Detail & Related papers (2025-05-24T07:04:32Z) - Complexity Matters: Rethinking the Latent Space for Generative Modeling [65.64763873078114]
In generative modeling, numerous successful approaches leverage a low-dimensional latent space, e.g., Stable Diffusion.
In this study, we aim to shed light on this under-explored topic by rethinking the latent space from the perspective of model complexity.
arXiv Detail & Related papers (2023-07-17T07:12:29Z) - Language as a Latent Sequence: deep latent variable models for semi-supervised paraphrase generation [47.33223015862104]
We present a novel unsupervised model named variational sequence auto-encoding reconstruction (VSAR), which performs latent sequence inference given an observed text.
To leverage information from text pairs, we additionally introduce a novel supervised model we call dual directional learning (DDL), which is designed to integrate with our proposed VSAR model.
Our empirical evaluations suggest that the combined model yields competitive performance against the state-of-the-art supervised baselines on complete data.
arXiv Detail & Related papers (2023-01-05T19:35:30Z) - Advancing Semi-Supervised Task Oriented Dialog Systems by JSA Learning of Discrete Latent Variable Models [22.249113574918034]
JSA-TOD represents the first work in developing JSA based semi-supervised learning of discrete latent variable conditional models.
Experiments show that JSA-TOD significantly outperforms its variational learning counterpart.
arXiv Detail & Related papers (2022-07-25T14:36:10Z) - Learning Mixtures of Linear Dynamical Systems [94.49754087817931]
We develop a two-stage meta-algorithm to efficiently recover each ground-truth LDS model up to error $\tilde{O}(\sqrt{d/T})$.
We validate our theoretical studies with numerical experiments, confirming the efficacy of the proposed algorithm.
arXiv Detail & Related papers (2022-01-26T22:26:01Z) - Continual Learning with Fully Probabilistic Models [70.3497683558609]
We present an approach for continual learning based on fully probabilistic (or generative) models of machine learning.
We propose a pseudo-rehearsal approach using a Gaussian Mixture Model (GMM) instance for both generator and classifier functionalities.
We show that GMR achieves state-of-the-art performance on common class-incremental learning problems at very competitive time and memory complexity.
arXiv Detail & Related papers (2021-04-19T12:26:26Z) - Differential Privacy and Byzantine Resilience in SGD: Do They Add Up? [6.614755043607777]
We study whether a distributed implementation of the renowned Stochastic Gradient Descent (SGD) learning algorithm is feasible with both differential privacy (DP) and $(\alpha,f)$-Byzantine resilience.
We show that a direct composition of these techniques makes the guarantees of the resulting SGD algorithm depend unfavourably upon the number of parameters in the ML model.
arXiv Detail & Related papers (2021-02-16T14:10:38Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs (a closed-form sketch is given after this list).
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - Recent Developments Combining Ensemble Smoother and Deep Generative Networks for Facies History Matching [58.720142291102135]
This research project focuses on the use of autoencoders networks to construct a continuous parameterization for facies models.
We benchmark seven different formulations: VAE, generative adversarial network (GAN), Wasserstein GAN, variational auto-encoding GAN, principal component analysis (PCA) with cycle GAN, PCA with transfer style network, and VAE with style loss.
arXiv Detail & Related papers (2020-05-08T21:32:42Z) - SUOD: Accelerating Large-Scale Unsupervised Heterogeneous Outlier Detection [63.253850875265115]
Outlier detection (OD) is a key machine learning (ML) task for identifying abnormal objects from general samples.
We propose a modular acceleration system, called SUOD, to speed up large-scale heterogeneous unsupervised outlier detection.
arXiv Detail & Related papers (2020-03-11T00:22:50Z)
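As referenced in the Cauchy-Schwarz Regularized Autoencoder entry above, the key point there is that the Cauchy-Schwarz divergence admits a closed form for Gaussian mixtures. Below is a minimal NumPy/SciPy sketch of that closed form, based only on the standard Gaussian product identity; the function and variable names are illustrative and not taken from that paper's code.

```python
# Closed-form Cauchy-Schwarz divergence between two GMMs (illustrative sketch).
# Uses the identity: integral of N(x; m1, C1) * N(x; m2, C2) dx = N(m1; m2, C1 + C2).
import numpy as np
from scipy.stats import multivariate_normal

def _overlap(m1, C1, m2, C2):
    # Integral of the product of two Gaussian densities.
    return multivariate_normal.pdf(m1, mean=m2, cov=C1 + C2)

def cs_divergence_gmm(w, mu, cov, v, nu, S):
    """D_CS(p, q) = -log(int p q) + 0.5 log(int p^2) + 0.5 log(int q^2)
    for p = sum_i w[i] N(mu[i], cov[i]) and q = sum_j v[j] N(nu[j], S[j])."""
    cross = sum(w[i] * v[j] * _overlap(mu[i], cov[i], nu[j], S[j])
                for i in range(len(w)) for j in range(len(v)))
    pp = sum(w[i] * w[k] * _overlap(mu[i], cov[i], mu[k], cov[k])
             for i in range(len(w)) for k in range(len(w)))
    qq = sum(v[j] * v[k] * _overlap(nu[j], S[j], nu[k], S[k])
             for j in range(len(v)) for k in range(len(v)))
    return -np.log(cross) + 0.5 * np.log(pp) + 0.5 * np.log(qq)

# Example: identical one-component mixtures give divergence ~0.
mu = [np.zeros(2)]; cov = [np.eye(2)]
print(cs_divergence_gmm([1.0], mu, cov, [1.0], mu, cov))
```

Because every term is just a Gaussian density evaluated at a mean, the expression is differentiable in the mixture parameters, which is what makes it usable as a training objective.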