Disentangled Interleaving Variational Encoding
- URL: http://arxiv.org/abs/2501.08710v2
- Date: Thu, 16 Jan 2025 09:07:00 GMT
- Title: Disentangled Interleaving Variational Encoding
- Authors: Noelle Y. L. Wong, Eng Yeow Cheu, Zhonglin Chiam, Dipti Srinivasan
- Abstract summary: We propose a principled approach to disentangle the original input into marginal and conditional probability distributions in the latent space of a variational autoencoder.
Our proposed model, Deep Disentangled Interleaving Variational Encoding (DeepDIVE), learns disentangled features from the original input to form clusters in the embedding space.
Experiments on two public datasets show that DeepDIVE disentangles the original input and yields forecast accuracies better than the original VAE.
- Score: 1.132458063021286
- License:
- Abstract: Conflicting objectives present a considerable challenge in interleaving multi-task learning, necessitating meticulous design and balance to ensure effective learning of a representative latent data space across all tasks without mutual negative impact. Drawing inspiration from the concept of marginal and conditional probability distributions in probability theory, we design a principled and well-founded approach to disentangle the original input into marginal and conditional probability distributions in the latent space of a variational autoencoder. Our proposed model, Deep Disentangled Interleaving Variational Encoding (DeepDIVE), learns disentangled features from the original input to form clusters in the embedding space and unifies these features via the cross-attention mechanism in the fusion stage. We theoretically prove that combining the objectives for reconstruction and forecasting fully captures the lower bound and mathematically derive a loss function for disentanglement using Naïve Bayes. Under the assumption that the prior is a mixture of log-concave distributions, we also establish that the Kullback-Leibler divergence between the prior and the posterior is upper bounded by a function minimized by the minimizer of the cross entropy loss, informing our adoption of radial basis functions (RBF) and cross entropy with interleaving training for DeepDIVE to provide a justified basis for convergence. Experiments on two public datasets show that DeepDIVE disentangles the original input and yields forecast accuracies better than the original VAE and comparable to existing state-of-the-art baselines.
Related papers
- Distributed Markov Chain Monte Carlo Sampling based on the Alternating Direction Method of Multipliers [143.6249073384419]
In this paper, we propose a distributed sampling scheme based on the alternating direction method of multipliers.
We provide both theoretical guarantees of our algorithm's convergence and experimental evidence of its superiority to the state-of-the-art.
In simulation, we deploy our algorithm on linear and logistic regression tasks and illustrate its fast convergence compared to existing gradient-based methods.
arXiv Detail & Related papers (2024-01-29T02:08:40Z)
- Distributed Variational Inference for Online Supervised Learning [15.038649101409804]
This paper develops a scalable distributed probabilistic inference algorithm.
It applies to continuous variables, intractable posteriors and large-scale real-time data in sensor networks.
arXiv Detail & Related papers (2023-09-05T22:33:02Z)
- Reliable amortized variational inference with physics-based latent distribution correction [0.4588028371034407]
A neural network is trained to approximate the posterior distribution over existing pairs of model and data.
The accuracy of this approach relies on the availability of high-fidelity training data.
We show that our correction step improves the robustness of amortized variational inference with respect to changes in number of source experiments, noise variance, and shifts in the prior distribution.
arXiv Detail & Related papers (2022-07-24T02:38:54Z)
- Toward Certified Robustness Against Real-World Distribution Shifts [65.66374339500025]
We train a generative model to learn perturbations from data and define specifications with respect to the output of the learned model.
A unique challenge arising from this setting is that existing verifiers cannot tightly approximate sigmoid activations.
We propose a general meta-algorithm for handling sigmoid activations which leverages classical notions of counter-example-guided abstraction refinement.
arXiv Detail & Related papers (2022-06-08T04:09:13Z)
- Distribution Regression with Sliced Wasserstein Kernels [45.916342378789174]
We propose the first OT-based estimator for distribution regression.
We study the theoretical properties of a kernel ridge regression estimator based on such representation.
arXiv Detail & Related papers (2022-02-08T15:21:56Z)
- Robust Estimation for Nonparametric Families via Generative Adversarial Networks [92.64483100338724]
We provide a framework for designing Generative Adversarial Networks (GANs) to solve high dimensional robust statistics problems.
Our work extends these to robust mean estimation, second moment estimation, and robust linear regression.
In terms of techniques, our proposed GAN losses can be viewed as a smoothed and generalized Kolmogorov-Smirnov distance.
arXiv Detail & Related papers (2022-02-02T20:11:33Z)
- Regularizing Variational Autoencoder with Diversity and Uncertainty Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z)
- Low-rank Characteristic Tensor Density Estimation Part II: Compression and Latent Density Estimation [31.631861197477185]
Learning generative probabilistic models is a core problem in machine learning.
This paper proposes a joint dimensionality reduction and non-parametric density estimation framework.
We demonstrate that the proposed model achieves very promising results on regression tasks, sampling, and anomaly detection.
arXiv Detail & Related papers (2021-06-20T00:38:56Z)
- General stochastic separation theorems with optimal bounds [68.8204255655161]
The phenomenon of separability was revealed and used in machine learning to correct errors of Artificial Intelligence (AI) systems and analyze AI instabilities.
Errors or clusters of errors can be separated from the rest of the data.
The ability to correct an AI system also opens up the possibility of an attack on it, and the high dimensionality induces vulnerabilities caused by the same separability.
arXiv Detail & Related papers (2020-10-11T13:12:41Z)
- Generative Model without Prior Distribution Matching [26.91643368299913]
Variational Autoencoder (VAE) and its variations are classic generative models that learn a low-dimensional latent representation satisfying some prior distribution.
We propose to let the prior match the embedding distribution rather than forcing the latent variables to fit the prior.
arXiv Detail & Related papers (2020-09-23T09:33:24Z)
- Bayesian Deep Learning and a Probabilistic Perspective of Generalization [56.69671152009899]
We show that deep ensembles provide an effective mechanism for approximate Bayesian marginalization.
We also propose a related approach that further improves the predictive distribution by marginalizing within basins of attraction.
arXiv Detail & Related papers (2020-02-20T15:13:27Z)
This list is automatically generated from the titles and abstracts of the papers on this site.