Related papers: Disentanglement Analysis in Deep Latent Variable Models Matching Aggregate Posterior Distributions

URL: http://arxiv.org/abs/2501.15705v1
Date: Sun, 26 Jan 2025 23:38:39 GMT
Title: Disentanglement Analysis in Deep Latent Variable Models Matching Aggregate Posterior Distributions
Authors: Surojit Saha, Sarang Joshi, Ross Whitaker
Abstract summary: We propose a method to evaluate disentanglement for deep latent variable models (DLVMs) in general. The proposed technique discovers the latent vectors representing the generative factors of a dataset that can be different from the cardinal latent axes.
Score: 0.5759862457142761
License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
Abstract: Deep latent variable models (DLVMs) are designed to learn meaningful representations in an unsupervised manner, such that the hidden explanatory factors are interpretable by independent latent variables (aka disentanglement). The variational autoencoder (VAE) is a popular DLVM widely studied in disentanglement analysis due to the modeling of the posterior distribution using a factorized Gaussian distribution that encourages the alignment of the latent factors with the latent axes. Several metrics have been proposed recently, assuming that the latent variables explaining the variation in data are aligned with the latent axes (cardinal directions). However, there are other DLVMs, such as the AAE and WAE-MMD (matching the aggregate posterior to the prior), where the latent variables might not be aligned with the latent axes. In this work, we propose a statistical method to evaluate disentanglement for any DLVMs in general. The proposed technique discovers the latent vectors representing the generative factors of a dataset that can be different from the cardinal latent axes. We empirically demonstrate the advantage of the method on two datasets.

Related papers

A Random Matrix Theory Perspective on the Consistency of Diffusion Models [31.63433424187031]
Diffusion models trained on different subsets of a dataset often produce strikingly similar outputs when given the same noise seed. We develop a random matrix theory (RMT) framework that quantifies how finite shape the expectation and variance of the learned denoiser and sampling map. We validate its predictions on UNet and DiT architectures in their non-memorization regime.
arXiv Detail & Related papers (2026-02-02T23:30:28Z)
An Elementary Approach to Scheduling in Generative Diffusion Models [55.171367482496755]
An elementary approach to characterizing the impact of noise scheduling and time discretization in generative diffusion models is developed. Experiments across different datasets and pretrained models demonstrate that the time discretization strategy selected by our approach consistently outperforms baseline and search-based strategies.
arXiv Detail & Related papers (2026-01-20T05:06:26Z)
Variational Inference for Latent Variable Models in High Dimensions [4.3012765978447565]
We introduce a general framework for quantifying the statistical accuracy of mean-field variational inference (MFVI). We capture the exact regime where MFVI 'works' for the celebrated latent Dirichlet allocation model. Our proof techniques, which extend the framework of nonlinear large deviations, open the door for the analysis of MFVI in other latent variable models.
arXiv Detail & Related papers (2025-06-02T17:19:58Z)
Causal vs. Anticausal merging of predictors [57.26526031579287]
We study the differences arising from merging predictors in the causal and anticausal directions using the same data. We use Causal Maximum Entropy (CMAXENT) as inductive bias to merge the predictors.
arXiv Detail & Related papers (2025-01-14T20:38:15Z)
Unpicking Data at the Seams: Understanding Disentanglement in VAEs [1.2352619722637816]
We show how diagonal posteriors "lock" a decoder's local axes so that density over the data manifold factorises along independent one-dimensional seams. This gives a clear definition of disentanglement, explains why it emerges in VAEs and shows that, under stated assumptions, ground truth factors are identifiable even with a symmetric prior.
arXiv Detail & Related papers (2024-10-29T21:54:18Z)
A Sparsity Principle for Partially Observable Causal Representation Learning [28.25303444099773]
Causal representation learning aims at identifying high-level causal variables from perceptual data. We focus on learning from unpaired observations from a dataset with an instance-dependent partial observability pattern. We propose two methods for estimating the underlying causal variables by enforcing sparsity in the inferred representation.
arXiv Detail & Related papers (2024-03-13T08:40:49Z)
On the Strong Correlation Between Model Invariance and Generalization [54.812786542023325]
Generalization captures a model's ability to classify unseen data. Invariance measures consistency of model predictions on transformations of the data. From a dataset-centric view, we find a certain model's accuracy and invariance linearly correlated on different test sets.
arXiv Detail & Related papers (2022-07-14T17:08:25Z)
ER: Equivariance Regularizer for Knowledge Graph Completion [107.51609402963072]
We propose a new regularizer, namely, Equivariance Regularizer (ER). ER can enhance the generalization ability of the model by employing the semantic equivariance between the head and tail entities. The experimental results indicate a clear and substantial improvement over the state-of-the-art relation prediction methods.
arXiv Detail & Related papers (2022-06-24T08:18:05Z)
Equivariance Discovery by Learned Parameter-Sharing [153.41877129746223]
We study how to discover interpretable equivariances from data. Specifically, we formulate this discovery process as an optimization problem over a model's parameter-sharing schemes. Also, we theoretically analyze the method for Gaussian data and provide a bound on the mean squared gap between the studied discovery scheme and the oracle scheme.
arXiv Detail & Related papers (2022-04-07T17:59:19Z)
Linear Discriminant Analysis with High-dimensional Mixed Variables [10.774094462083843]
This paper develops a novel approach for classifying high-dimensional observations with mixed variables. We overcome the challenge of having to split data into exponentially many cells. Results on the estimation accuracy and the misclassification rates are established.
arXiv Detail & Related papers (2021-12-14T03:57:56Z)
Latent Causal Invariant Model [128.7508609492542]
Current supervised learning can learn spurious correlation during the data-fitting process. We propose a Latent Causal Invariance Model (LaCIM) which pursues causal prediction.
arXiv Detail & Related papers (2020-11-04T10:00:27Z)
Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations. Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs. We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
tvGP-VAE: Tensor-variate Gaussian Process Prior Variational Autoencoder [0.0]
tvGP-VAE is able to explicitly model correlation via the use of kernel functions. We show that the choice of which correlation structures to explicitly represent in the latent space has a significant impact on model performance.
arXiv Detail & Related papers (2020-06-08T17:59:13Z)
NestedVAE: Isolating Common Factors via Weak Supervision [45.366986365879505]
We identify the connection between the task of bias reduction and that of isolating factors common between domains. To isolate the common factors we combine the theory of deep latent variable models with information bottleneck theory. Two outer VAEs with shared weights attempt to reconstruct the input and infer a latent space, whilst a nested VAE attempts to reconstruct the latent representation of one image, from the latent representation of its paired image.
arXiv Detail & Related papers (2020-02-26T15:49:57Z)

This list is automatically generated from the titles and abstracts of the papers on this site.