Disentangled Representation Learning with Wasserstein Total Correlation
- URL: http://arxiv.org/abs/1912.12818v1
- Date: Mon, 30 Dec 2019 05:31:28 GMT
- Title: Disentangled Representation Learning with Wasserstein Total Correlation
- Authors: Yijun Xiao, William Yang Wang
- Abstract summary: We introduce Wasserstein total correlation in both variational autoencoder and Wasserstein autoencoder settings to learn disentangled latent representations.
A critic is adversarially trained along with the main objective to estimate the Wasserstein total correlation term.
We show that the proposed approach achieves comparable disentanglement performance with a smaller sacrifice in reconstruction ability.
- Score: 90.44329632061076
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised learning of disentangled representations involves uncovering
the different factors of variation that contribute to the data generation process.
Total correlation penalization has been a key component in recent methods
towards disentanglement. However, Kullback-Leibler (KL) divergence-based total
correlation is metric-agnostic and sensitive to data samples. In this paper, we
introduce Wasserstein total correlation in both variational autoencoder and
Wasserstein autoencoder settings to learn disentangled latent representations.
A critic is adversarially trained along with the main objective to estimate the
Wasserstein total correlation term. We discuss the benefits of using
Wasserstein distance over KL divergence to measure independence and conduct
quantitative and qualitative experiments on several data sets. Moreover, we
introduce a new metric to measure disentanglement. We show that the proposed
approach achieves comparable disentanglement performance with a smaller sacrifice
in reconstruction ability.
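To make the penalized term concrete: total correlation is the KL divergence between the aggregate posterior q(z) and the product of its marginals, and the Wasserstein variant replaces that KL with a Wasserstein distance estimated by an adversarially trained critic. Below is a minimal, hypothetical sketch of such a critic, assuming a FactorVAE-style permutation trick to sample from the product of marginals and an illustrative penalty weight lambda_tc; it is a sketch of the idea, not the authors' implementation.

import torch
import torch.nn as nn

class TCCritic(nn.Module):
    # Scalar-valued critic f(z) used in the Kantorovich-Rubinstein dual of the
    # Wasserstein distance between q(z) and prod_j q(z_j).
    def __init__(self, z_dim, hidden=256):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(z_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, hidden), nn.ReLU(),
            nn.Linear(hidden, 1),
        )

    def forward(self, z):
        return self.net(z).squeeze(-1)

def permute_dims(z):
    # Permute each latent dimension independently across the batch so the
    # result approximates samples from the product of marginals.
    return torch.stack(
        [z[torch.randperm(z.size(0)), j] for j in range(z.size(1))], dim=1
    )

def wasserstein_tc(critic, z):
    # E_{q(z)}[f(z)] - E_{prod_j q(z_j)}[f(z)]; the critic must be kept
    # approximately 1-Lipschitz, e.g. via weight clipping or a gradient penalty.
    return critic(z).mean() - critic(permute_dims(z).detach()).mean()

# In a training step (names such as lambda_tc are illustrative assumptions):
#   z ~ q(z|x); loss = reconstruction_term + prior_term + lambda_tc * wasserstein_tc(critic, z)
#   while the critic itself is updated to maximize wasserstein_tc(critic, z.detach()).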
Related papers
- Synthetic Tabular Data Validation: A Divergence-Based Approach [8.062368743143388]
Divergences quantify discrepancies between data distributions.
Traditional approaches calculate divergences independently for each feature.
We propose a novel approach that uses divergence estimation to overcome the limitations of marginal comparisons.
arXiv Detail & Related papers (2024-05-13T15:07:52Z) - Mutual Wasserstein Discrepancy Minimization for Sequential
- Mutual Wasserstein Discrepancy Minimization for Sequential Recommendation [82.0801585843835]
We propose a novel self-supervised learning framework based on Mutual Wasserstein discrepancy minimization (MStein) for sequential recommendation.
We also propose a novel contrastive learning loss based on Wasserstein Discrepancy Measurement.
arXiv Detail & Related papers (2023-01-28T13:38:48Z) - Disentanglement Analysis with Partial Information Decomposition [31.56299813238937]
- Disentanglement Analysis with Partial Information Decomposition [31.56299813238937]
Disentangled representations aim to reverse the data generation process by mapping data to multiple random variables that individually capture distinct generative factors.
Current disentanglement metrics are designed to measure the concentration, e.g., absolute deviation, variance, or entropy, of each variable conditioned on each generative factor.
In this work, we use the Partial Information Decomposition framework to evaluate information sharing between more than two variables, and build on it a framework that includes a new disentanglement metric.
arXiv Detail & Related papers (2021-08-31T11:09:40Z) - Learning disentangled representations with the Wasserstein Autoencoder [22.54887526392739]
- Learning disentangled representations with the Wasserstein Autoencoder [22.54887526392739]
We propose TCWAE (Total Correlation Wasserstein Autoencoder) to penalize the total correlation in latent variables.
We show that working in the WAE paradigm naturally enables the separation of the total-correlation term, thus providing disentanglement control over the learned representation.
We further study the trade-off between disentanglement and reconstruction on more difficult datasets with unknown generative factors, where the flexibility of the WAE paradigm in the reconstruction term improves reconstructions.
arXiv Detail & Related papers (2020-10-07T14:52:06Z) - Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
- Interpolation and Learning with Scale Dependent Kernels [91.41836461193488]
We study the learning properties of nonparametric ridgeless least squares.
We consider the common case of estimators defined by scale-dependent kernels.
arXiv Detail & Related papers (2020-06-17T16:43:37Z) - On Disentangled Representations Learned From Correlated Data [59.41587388303554]
- On Disentangled Representations Learned From Correlated Data [59.41587388303554]
We bridge the gap to real-world scenarios by analyzing the behavior of the most prominent disentanglement approaches on correlated data.
We show that systematically induced correlations in the dataset are being learned and reflected in the latent representations.
We also demonstrate how to resolve these latent correlations, either using weak supervision during training or by post-hoc correcting a pre-trained model with a small number of labels.
arXiv Detail & Related papers (2020-06-14T12:47:34Z) - An Investigation of Why Overparameterization Exacerbates Spurious
- An Investigation of Why Overparameterization Exacerbates Spurious Correlations [98.3066727301239]
We identify two key properties of the training data that drive this behavior.
We show how the inductive bias of models towards "memorizing" fewer examples can cause overparameterization to hurt generalization in the presence of spurious correlations.
arXiv Detail & Related papers (2020-05-09T01:59:13Z) - Learning from Aggregate Observations [82.44304647051243]
- Learning from Aggregate Observations [82.44304647051243]
We study the problem of learning from aggregate observations, where supervision signals are given to sets of instances rather than to individual instances.
We present a general probabilistic framework that accommodates a variety of aggregate observations.
Simple maximum likelihood solutions can be applied to various differentiable models.
arXiv Detail & Related papers (2020-04-14T06:18:50Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information it contains and is not responsible for any consequences arising from its use.