$t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's
t and Power Divergence
- URL: http://arxiv.org/abs/2312.01133v2
- Date: Sun, 3 Mar 2024 08:58:36 GMT
- Title: $t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's
t and Power Divergence
- Authors: Juno Kim, Jaehyuk Kwon, Mincheol Cho, Hyunjong Lee, Joong-Ho Won
- Abstract summary: $t3$VAE is a modified VAE framework that incorporates Student's t-distributions for the prior, encoder, and decoder.
We show that $t3$VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
- Score: 7.0479532872043755
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The variational autoencoder (VAE) typically employs a standard normal prior
as a regularizer for the probabilistic latent encoder. However, the Gaussian
tail often decays too quickly to effectively accommodate the encoded points,
failing to preserve crucial structures hidden in the data. In this paper, we
explore the use of heavy-tailed models to combat over-regularization. Drawing
upon insights from information geometry, we propose $t^3$VAE, a modified VAE
framework that incorporates Student's t-distributions for the prior, encoder,
and decoder. This results in a joint model distribution of a power form which
we argue can better fit real-world datasets. We derive a new objective by
reformulating the evidence lower bound as joint optimization of KL divergence
between two statistical manifolds and replacing with $\gamma$-power divergence,
a natural alternative for power families. $t^3$VAE demonstrates superior
generation of low-density regions when trained on heavy-tailed synthetic data.
Furthermore, we show that $t^3$VAE significantly outperforms other models on
CelebA and imbalanced CIFAR-100 datasets.
Related papers
- Wasserstein Distributionally Robust Multiclass Support Vector Machine [1.8570591025615457]
We study the problem of multiclass classification for settings where data features $mathbfx$ and their labels $mathbfy$ are uncertain.
We use Wasserstein distributionally robust optimization to develop a robust version of the multiclass support vector machine (SVM) characterized by the Crammer-Singer (CS) loss.
Our numerical experiments demonstrate that our model outperforms state-of-the art OVA models in settings where the training data is highly imbalanced.
arXiv Detail & Related papers (2024-09-12T21:40:04Z) - Robust Reinforcement Learning from Corrupted Human Feedback [86.17030012828003]
Reinforcement learning from human feedback (RLHF) provides a principled framework for aligning AI systems with human preference data.
We propose a robust RLHF approach -- $R3M$, which models the potentially corrupted preference label as sparse outliers.
Our experiments on robotic control and natural language generation with large language models (LLMs) show that $R3M$ improves robustness of the reward against several types of perturbations to the preference data.
arXiv Detail & Related papers (2024-06-21T18:06:30Z) - Machine Learning Force Fields with Data Cost Aware Training [94.78998399180519]
Machine learning force fields (MLFF) have been proposed to accelerate molecular dynamics (MD) simulation.
Even for the most data-efficient MLFFs, reaching chemical accuracy can require hundreds of frames of force and energy labels.
We propose a multi-stage computational framework -- ASTEROID, which lowers the data cost of MLFFs by leveraging a combination of cheap inaccurate data and expensive accurate data.
arXiv Detail & Related papers (2023-06-05T04:34:54Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-re (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - Causal Recurrent Variational Autoencoder for Medical Time Series
Generation [12.82521953179345]
We propose causal recurrent variational autoencoder (CR-VAE), a novel generative model that learns a Granger causal graph from a time series x.
Our model consistently outperforms state-of-the-art time series generative models both qualitatively and quantitatively.
arXiv Detail & Related papers (2023-01-16T19:13:33Z) - Training \beta-VAE by Aggregating a Learned Gaussian Posterior with a
Decoupled Decoder [0.553073476964056]
Current practices in VAE training often result in a trade-off between the reconstruction fidelity and the continuity$/$disentanglement of the latent space.
We present intuitions and a careful analysis of the antagonistic mechanism of the two losses, and propose a simple yet effective two-stage method for training a VAE.
We evaluate the method using a medical dataset intended for 3D skull reconstruction and shape completion, and the results indicate promising generative capabilities of the VAE trained using the proposed method.
arXiv Detail & Related papers (2022-09-29T13:49:57Z) - Few-Shot Non-Parametric Learning with Deep Latent Variable Model [50.746273235463754]
We propose Non-Parametric learning by Compression with Latent Variables (NPC-LV)
NPC-LV is a learning framework for any dataset with abundant unlabeled data but very few labeled ones.
We show that NPC-LV outperforms supervised methods on all three datasets on image classification in low data regime.
arXiv Detail & Related papers (2022-06-23T09:35:03Z) - uGLAD: Sparse graph recovery by optimizing deep unrolled networks [11.48281545083889]
We present a novel technique to perform sparse graph recovery by optimizing deep unrolled networks.
Our model, uGLAD, builds upon and extends the state-of-the-art model GLAD to the unsupervised setting.
We evaluate model results on synthetic Gaussian data, non-Gaussian data generated from Gene Regulatory Networks, and present a case study in anaerobic digestion.
arXiv Detail & Related papers (2022-05-23T20:20:27Z) - Regularizing Variational Autoencoder with Diversity and Uncertainty
Awareness [61.827054365139645]
Variational Autoencoder (VAE) approximates the posterior of latent variables based on amortized variational inference.
We propose an alternative model, DU-VAE, for learning a more Diverse and less Uncertain latent space.
arXiv Detail & Related papers (2021-10-24T07:58:13Z) - Cauchy-Schwarz Regularized Autoencoder [68.80569889599434]
Variational autoencoders (VAE) are a powerful and widely-used class of generative models.
We introduce a new constrained objective based on the Cauchy-Schwarz divergence, which can be computed analytically for GMMs.
Our objective improves upon variational auto-encoding models in density estimation, unsupervised clustering, semi-supervised learning, and face analysis.
arXiv Detail & Related papers (2021-01-06T17:36:26Z) - To Regularize or Not To Regularize? The Bias Variance Trade-off in
Regularized AEs [10.611727286504994]
We study the effect of the latent prior on the generation deterministic quality of AE models.
We show that our model, called FlexAE, is the new state-of-the-art for the AE based generative models.
arXiv Detail & Related papers (2020-06-10T14:00:14Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.