Unpicking Data at the Seams: Understanding Disentanglement in VAEs
- URL: http://arxiv.org/abs/2410.22559v4
- Date: Thu, 06 Feb 2025 15:35:37 GMT
- Title: Unpicking Data at the Seams: Understanding Disentanglement in VAEs
- Authors: Carl Allen,
- Abstract summary: Disentanglement, or identifying statistically independent factors of the data, is relevant to much of machine learning.
Disentanglement arises in several generative paradigms including Variational Autoencoders (VAEs), Generative Adversarial Networks and diffusion models.
- Score: 5.05828899601167
- License:
- Abstract: Disentanglement, or identifying statistically independent factors of the data, is relevant to much of machine learning, from controlled data generation and robust classification to efficient encoding and improving our understanding of the data itself. Disentanglement arises in several generative paradigms including Variational Autoencoders (VAEs), Generative Adversarial Networks and diffusion models. Recent progress has been made in understanding disentanglement in VAEs, where a choice of diagonal posterior covariance matrices is shown to promote mutual orthogonality between columns of the decoder's Jacobian. We build on this to show how such orthogonality, a geometric property, translates to disentanglement, a statistical property, furthering our understanding of how a VAE identifies independent components of, or disentangles, the data.
Related papers
- Integrating Random Effects in Variational Autoencoders for Dimensionality Reduction of Correlated Data [9.990687944474738]
LMMVAE is a novel model which separates the classic VAE latent model into fixed and random parts.
It is shown to improve squared reconstruction error and negative likelihood loss significantly on unseen data.
arXiv Detail & Related papers (2024-12-22T07:20:17Z) - Disentanglement with Factor Quantized Variational Autoencoders [11.086500036180222]
We propose a discrete variational autoencoder (VAE) based model where the ground truth information about the generative factors are not provided to the model.
We demonstrate the advantages of learning discrete representations over learning continuous representations in facilitating disentanglement.
Our method called FactorQVAE combines optimization based disentanglement approaches with discrete representation learning.
arXiv Detail & Related papers (2024-09-23T09:33:53Z) - An improved tabular data generator with VAE-GMM integration [9.4491536689161]
We propose a novel Variational Autoencoder (VAE)-based model that addresses limitations of current approaches.
Inspired by the TVAE model, our approach incorporates a Bayesian Gaussian Mixture model (BGM) within the VAE architecture.
We thoroughly validate our model on three real-world datasets with mixed data types, including two medically relevant ones.
arXiv Detail & Related papers (2024-04-12T12:31:06Z) - Disentanglement via Latent Quantization [60.37109712033694]
In this work, we construct an inductive bias towards encoding to and decoding from an organized latent space.
We demonstrate the broad applicability of this approach by adding it to both basic data-re (vanilla autoencoder) and latent-reconstructing (InfoGAN) generative models.
arXiv Detail & Related papers (2023-05-28T06:30:29Z) - Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data.
Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes.
Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z) - Is Disentanglement enough? On Latent Representations for Controllable
Music Generation [78.8942067357231]
In the absence of a strong generative decoder, disentanglement does not necessarily imply controllability.
The structure of the latent space with respect to the VAE-decoder plays an important role in boosting the ability of a generative model to manipulate different attributes.
arXiv Detail & Related papers (2021-08-01T18:37:43Z) - Multivariate Data Explanation by Jumping Emerging Patterns Visualization [78.6363825307044]
We present VAX (multiVariate dAta eXplanation), a new VA method to support the identification and visual interpretation of patterns in multivariate data sets.
Unlike the existing similar approaches, VAX uses the concept of Jumping Emerging Patterns to identify and aggregate several diversified patterns, producing explanations through logic combinations of data variables.
arXiv Detail & Related papers (2021-06-21T13:49:44Z) - Autoencoding Variational Autoencoder [56.05008520271406]
We study the implications of this behaviour on the learned representations and also the consequences of fixing it by introducing a notion of self consistency.
We show that encoders trained with our self-consistency approach lead to representations that are robust (insensitive) to perturbations in the input introduced by adversarial attacks.
arXiv Detail & Related papers (2020-12-07T14:16:14Z) - Category-Learning with Context-Augmented Autoencoder [63.05016513788047]
Finding an interpretable non-redundant representation of real-world data is one of the key problems in Machine Learning.
We propose a novel method of using data augmentations when training autoencoders.
We train a Variational Autoencoder in such a way, that it makes transformation outcome predictable by auxiliary network.
arXiv Detail & Related papers (2020-10-10T14:04:44Z) - Deterministic Decoding for Discrete Data in Variational Autoencoders [5.254093731341154]
We study a VAE model with a deterministic decoder (DD-VAE) for sequential data that selects the highest-scoring tokens instead of sampling.
We demonstrate the performance of DD-VAE on multiple datasets, including molecular generation and optimization problems.
arXiv Detail & Related papers (2020-03-04T16:36:52Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.