Training $\beta$-VAE by Aggregating a Learned Gaussian Posterior with a Decoupled Decoder
- URL: http://arxiv.org/abs/2209.14783v1
- Date: Thu, 29 Sep 2022 13:49:57 GMT
- Title: Training $\beta$-VAE by Aggregating a Learned Gaussian Posterior with a Decoupled Decoder
- Authors: Jianning Li, Jana Fragemann, Seyed-Ahmad Ahmadi, Jens Kleesiek, and Jan Egger
- Abstract summary: Current practices in VAE training often result in a trade-off between the reconstruction fidelity and the continuity/disentanglement of the latent space.
We present intuitions and a careful analysis of the antagonistic mechanism of the two losses, and propose a simple yet effective two-stage method for training a VAE.
We evaluate the method using a medical dataset intended for 3D skull reconstruction and shape completion, and the results indicate promising generative capabilities of the VAE trained using the proposed method.
- Score: 0.553073476964056
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: The reconstruction loss and the Kullback-Leibler divergence (KLD) loss in a
variational autoencoder (VAE) often play antagonistic roles, and tuning the
weight of the KLD loss in $\beta$-VAE to achieve a balance between the two
losses is a tricky and dataset-specific task. As a result, current practices in
VAE training often result in a trade-off between the reconstruction fidelity
and the continuity/disentanglement of the latent space, if the weight $\beta$
is not carefully tuned. In this paper, we present intuitions and a careful
analysis of the antagonistic mechanism of the two losses, and propose, based on
the insights, a simple yet effective two-stage method for training a VAE.
Specifically, the method aggregates a learned Gaussian posterior $z \sim
q_{\theta} (z|x)$ with a decoder decoupled from the KLD loss, which is trained
to learn a new conditional distribution $p_{\phi} (x|z)$ of the input data $x$.
Experimentally, we show that the aggregated VAE maximally satisfies the
Gaussian assumption about the latent space, while still achieving a
reconstruction error comparable to that obtained when the latent space is only
loosely regularized by $\mathcal{N}(\mathbf{0},I)$. The proposed approach does not
require hyperparameter (i.e., the KLD weight $\beta$) tuning given a specific
dataset as required in common VAE training practices. We evaluate the method
using a medical dataset intended for 3D skull reconstruction and shape
completion, and the results indicate promising generative capabilities of the
VAE trained using the proposed method. Moreover, through guided manipulation of
the latent variables, we establish a connection between existing autoencoder
(AE)-based approaches and generative approaches, such as the VAE, for the shape
completion problem. Code and pre-trained weights are available at
https://github.com/Jianningli/skullVAE
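
The two-stage recipe described in the abstract can be sketched compactly: stage one minimizes the usual $\beta$-VAE objective $\mathbb{E}_{q_{\theta}(z|x)}[-\log p_{\phi}(x|z)] + \beta\,\mathrm{KL}\big(q_{\theta}(z|x)\,\|\,\mathcal{N}(\mathbf{0},I)\big)$, and stage two keeps the learned Gaussian posterior fixed and retrains a decoder, decoupled from the KLD loss, on the reconstruction term alone. The PyTorch snippet below is a minimal illustration under assumed details (fully connected layers, MSE reconstruction, a generic data loader of flattened inputs); it is not the authors' released skull-reconstruction implementation, which lives at https://github.com/Jianningli/skullVAE.

```python
# Minimal two-stage sketch (assumed architecture and hyperparameters,
# not the authors' released implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F


class Encoder(nn.Module):
    def __init__(self, x_dim=784, z_dim=32):
        super().__init__()
        self.fc = nn.Linear(x_dim, 256)
        self.mu = nn.Linear(256, z_dim)
        self.logvar = nn.Linear(256, z_dim)

    def forward(self, x):
        h = F.relu(self.fc(x))
        return self.mu(h), self.logvar(h)


class Decoder(nn.Module):
    def __init__(self, x_dim=784, z_dim=32):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(z_dim, 256), nn.ReLU(), nn.Linear(256, x_dim))

    def forward(self, z):
        return self.net(z)


def reparameterize(mu, logvar):
    # z = mu + sigma * eps, with eps ~ N(0, I)
    return mu + torch.exp(0.5 * logvar) * torch.randn_like(mu)


def kld(mu, logvar):
    # KL( N(mu, diag(sigma^2)) || N(0, I) ), summed over latent dims, mean over batch
    return (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(dim=1)).mean()


def train_two_stage(loader, beta=1.0, epochs=10, device="cpu"):
    enc, dec = Encoder().to(device), Decoder().to(device)

    # Stage 1: standard beta-VAE training. The KLD term pulls q_theta(z|x)
    # towards N(0, I), typically at the cost of reconstruction fidelity.
    opt = torch.optim.Adam(list(enc.parameters()) + list(dec.parameters()), lr=1e-3)
    for _ in range(epochs):
        for x in loader:  # loader assumed to yield flattened input tensors
            x = x.to(device)
            mu, logvar = enc(x)
            z = reparameterize(mu, logvar)
            loss = F.mse_loss(dec(z), x) + beta * kld(mu, logvar)
            opt.zero_grad()
            loss.backward()
            opt.step()

    # Stage 2: freeze the learned Gaussian posterior q_theta(z|x) and train a
    # fresh decoder p_phi(x|z) on the reconstruction loss only, decoupled from
    # the KLD loss, so the latent space keeps its N(0, I)-like structure while
    # the reconstruction error recovers.
    dec2 = Decoder().to(device)
    opt2 = torch.optim.Adam(dec2.parameters(), lr=1e-3)
    enc.eval()
    for _ in range(epochs):
        for x in loader:
            x = x.to(device)
            with torch.no_grad():
                mu, logvar = enc(x)
                z = reparameterize(mu, logvar)
            loss = F.mse_loss(dec2(z), x)
            opt2.zero_grad()
            loss.backward()
            opt2.step()

    return enc, dec2
```

The point mirrored here is that the second-stage decoder only ever sees samples from the frozen $q_{\theta}(z|x)$, so the aggregated model retains the $\mathcal{N}(\mathbf{0},I)$-like latent structure while regaining reconstruction fidelity, without any dataset-specific tuning of $\beta$.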
Related papers
- Statistical-Computational Trade-offs for Greedy Recursive Partitioning Estimators [23.056208049082134]
We show that greedy algorithms for high-dimensional regression get stuck at local optima.
We show that greedy training requires $\exp(\Omega(d))$ samples to achieve low estimation error.
We also show that greedy training can attain small estimation error with only $O(\log d)$ samples.
arXiv Detail & Related papers (2024-11-07T03:11:53Z)
- Rejection via Learning Density Ratios [50.91522897152437]
Classification with rejection emerges as a learning paradigm which allows models to abstain from making predictions.
We propose a different distributional perspective, where we seek to find an idealized data distribution which maximizes a pretrained model's performance.
Our framework is tested empirically over clean and noisy datasets.
arXiv Detail & Related papers (2024-05-29T01:32:17Z)
- $t^3$-Variational Autoencoder: Learning Heavy-tailed Data with Student's t and Power Divergence [7.0479532872043755]
$t^3$-VAE is a modified VAE framework that incorporates Student's t-distributions for the prior, encoder, and decoder.
We show that $t^3$-VAE significantly outperforms other models on CelebA and imbalanced CIFAR-100 datasets.
arXiv Detail & Related papers (2023-12-02T13:14:28Z)
- Matching aggregate posteriors in the variational autoencoder [0.5759862457142761]
The variational autoencoder (VAE) is a well-studied, deep, latent-variable model (DLVM).
This paper addresses shortcomings in VAEs by reformulating the objective function associated with VAEs in order to match the aggregate/marginal posterior distribution to the prior.
The proposed method is named the aggregate variational autoencoder (AVAE) and is built on the theoretical framework of the VAE.
arXiv Detail & Related papers (2023-11-13T19:22:37Z)
- Addressing GAN Training Instabilities via Tunable Classification Losses [8.151943266391493]
Generative adversarial networks (GANs) allow generating synthetic data with formal guarantees.
We show that all symmetric $f$-divergences are equivalent in convergence.
We also highlight the value of tuning $(\alpha_D,\alpha_G)$ in alleviating training instabilities for the synthetic 2D Gaussian mixture ring.
arXiv Detail & Related papers (2023-10-27T17:29:07Z)
- Learning to Re-weight Examples with Optimal Transport for Imbalanced Classification [74.62203971625173]
Imbalanced data pose challenges for deep learning based classification models.
One of the most widely-used approaches for tackling imbalanced data is re-weighting.
We propose a novel re-weighting method based on optimal transport (OT) from a distributional point of view.
arXiv Detail & Related papers (2022-08-05T01:23:54Z)
- Bounding the Width of Neural Networks via Coupled Initialization -- A Worst Case Analysis [121.9821494461427]
We show how to significantly reduce the number of neurons required for two-layer ReLU networks.
We also prove new lower bounds that improve upon prior work, and that under certain assumptions, are best possible.
arXiv Detail & Related papers (2022-06-26T06:51:31Z)
- Provably Efficient Offline Reinforcement Learning with Trajectory-Wise Reward [66.81579829897392]
We propose a novel offline reinforcement learning algorithm called Pessimistic vAlue iteRaTion with rEward Decomposition (PARTED).
PARTED decomposes the trajectory return into per-step proxy rewards via least-squares-based reward redistribution, and then performs pessimistic value iteration based on the learned proxy rewards.
To the best of our knowledge, PARTED is the first offline RL algorithm that is provably efficient in general MDP with trajectory-wise reward.
arXiv Detail & Related papers (2022-06-13T19:11:22Z)
- Anti-Concentrated Confidence Bonuses for Scalable Exploration [57.91943847134011]
Intrinsic rewards play a central role in handling the exploration-exploitation trade-off.
We introduce anti-concentrated confidence bounds for efficiently approximating the elliptical bonus.
We develop a practical variant for deep reinforcement learning that is competitive with contemporary intrinsic rewards on Atari benchmarks.
arXiv Detail & Related papers (2021-10-21T15:25:15Z)
- DynamicVAE: Decoupling Reconstruction Error and Disentangled Representation Learning [15.317044259237043]
This paper challenges the common assumption that the weight $\beta$, in $\beta$-VAE, should be larger than $1$ in order to effectively disentangle latent factors.
We demonstrate that $\beta$-VAE, with $\beta \leq 1$, can not only attain good disentanglement but also significantly improve reconstruction accuracy via dynamic control.
arXiv Detail & Related papers (2020-09-15T00:01:11Z)
- Preventing Posterior Collapse with Levenshtein Variational Autoencoder [61.30283661804425]
We propose to replace the evidence lower bound (ELBO) with a new objective which is simple to optimize and prevents posterior collapse.
We show that Levenshtein VAE produces more informative latent representations than alternative approaches to preventing posterior collapse.
arXiv Detail & Related papers (2020-04-30T13:27:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.