Where and What? Examining Interpretable Disentangled Representations
- URL: http://arxiv.org/abs/2104.05622v1
- Date: Wed, 7 Apr 2021 11:22:02 GMT
- Title: Where and What? Examining Interpretable Disentangled Representations
- Authors: Xinqi Zhu, Chang Xu, Dacheng Tao
- Abstract summary: Capturing interpretable variations has long been one of the goals in disentanglement learning.
Unlike the independence assumption, interpretability has rarely been exploited to encourage disentanglement in the unsupervised setting.
In this paper, we examine the interpretability of disentangled representations by investigating two questions: where to interpret and what to interpret.
- Score: 96.32813624341833
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Capturing interpretable variations has long been one of the goals in
disentanglement learning. However, unlike the independence assumption,
interpretability has rarely been exploited to encourage disentanglement in the
unsupervised setting. In this paper, we examine the interpretability of
disentangled representations by investigating two questions: where to
interpret and what to interpret. A latent code is easy to interpret if it
consistently impacts a certain subarea of the resulting generated image. We
thus propose to learn a spatial mask that localizes the effect
of each individual latent dimension. On the other hand, interpretability
usually comes from latent dimensions that capture simple and basic variations
in data. We thus perturb a certain dimension of the latent code and require
that the perturbation be identifiable from the generated images, so that the
encoding of simple variations is enforced.
Additionally, we develop an unsupervised model selection method, which
accumulates perceptual distance scores along axes in the latent space. On
various datasets, our models can learn high-quality disentangled
representations without supervision, showing that the proposed modeling of
interpretability is an effective proxy for achieving unsupervised
disentanglement.
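
To make the "where" question concrete, below is a minimal PyTorch sketch of the spatial-mask idea from the abstract. The class name `MaskedModulation`, the mask parameterization, and the modulation scheme are illustrative assumptions rather than the paper's actual architecture; the point is only that each latent dimension is paired with a learned spatial mask that gates where in the generated feature map that dimension can act.

```python
import torch
import torch.nn as nn

class MaskedModulation(nn.Module):
    """Gates each latent dimension's effect through a learned spatial mask.

    Hypothetical sketch; the paper's real architecture may differ.
    """
    def __init__(self, num_latents: int, channels: int, feat_size: int = 16):
        super().__init__()
        # One spatial mask (as logits) per latent dimension.
        self.mask_logits = nn.Parameter(
            torch.zeros(num_latents, 1, feat_size, feat_size))
        # Projects each scalar latent into a per-channel modulation.
        self.to_style = nn.Linear(1, channels)

    def forward(self, feat: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # feat: (B, C, H, W) intermediate generator features; z: (B, num_latents)
        out = feat
        for k in range(z.shape[1]):
            mask = torch.sigmoid(self.mask_logits[k])   # (1, H, W), in [0, 1]
            style = self.to_style(z[:, k:k + 1])        # (B, C)
            style = style.unsqueeze(-1).unsqueeze(-1)   # (B, C, 1, 1)
            # Latent k only modulates features inside its own mask.
            out = out + mask * style * feat
        return out
```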
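For the "what" question, the perturbation idea can be sketched as a self-supervised loss: perturb one latent dimension, then ask a recognizer network to tell, from the resulting image pair, which dimension moved and by how much. `G` (generator) and `R` (recognizer) are assumed callables here, not the paper's exact networks.

```python
import torch
import torch.nn.functional as F

def perturbation_loss(G, R, z, sigma=1.0):
    """Encourages each latent dimension to encode a simple, identifiable variation.

    G: z -> image; R: (image, image) -> (dimension logits, predicted shift).
    Both are hypothetical stand-ins for the paper's networks.
    """
    B, D = z.shape
    k = torch.randint(0, D, (B,), device=z.device)   # which dim to perturb
    eps = torch.randn(B, device=z.device) * sigma    # how much to perturb
    z_pert = z.clone()
    z_pert[torch.arange(B, device=z.device), k] += eps
    x, x_pert = G(z), G(z_pert)
    dim_logits, eps_hat = R(x, x_pert)
    # Identify which dimension moved, and by how much.
    return F.cross_entropy(dim_logits, k) + F.mse_loss(eps_hat, eps)
```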
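The unsupervised model selection method can likewise be sketched by accumulating perceptual distances along each latent axis during traversals; models whose axes produce consistent perceptual change can then be ranked. The traversal range, step count, aggregation, and the `perceptual_dist` function (e.g., an LPIPS-style metric) are all assumptions, not the paper's exact protocol.

```python
import torch

@torch.no_grad()
def model_score(G, perceptual_dist, num_latents, steps=8, samples=64, span=2.0):
    """Accumulates perceptual distance scores along each latent axis (sketch)."""
    z = torch.randn(samples, num_latents)
    per_axis = []
    for k in range(num_latents):
        total, prev = 0.0, None
        for v in torch.linspace(-span, span, steps):
            z_k = z.clone()
            z_k[:, k] = v
            img = G(z_k)
            if prev is not None:
                # Perceptual change induced by moving along axis k.
                total += perceptual_dist(img, prev).mean().item()
            prev = img
        per_axis.append(total)
    return per_axis  # compare across trained models to select one
```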
Related papers
- Decoding Diffusion: A Scalable Framework for Unsupervised Analysis of Latent Space Biases and Representations Using Natural Language Prompts [68.48103545146127]
This paper proposes a novel framework for unsupervised exploration of diffusion latent spaces.
We directly leverage natural language prompts and image captions to map latent directions.
Our method provides a more scalable and interpretable understanding of the semantic knowledge encoded within diffusion models.
arXiv Detail & Related papers (2024-10-25T21:44:51Z)
- Semantic uncertainty intervals for disentangled latent spaces [30.254614465166245]
We provide principled uncertainty intervals guaranteed to contain the true semantic factors for any underlying generative model.
This technique reliably communicates semantically meaningful, principled, and instance-adaptive uncertainty in inverse problems like image super-resolution and image completion.
arXiv Detail & Related papers (2022-07-20T17:58:10Z)
- Weakly Supervised Representation Learning with Sparse Perturbations [82.39171485023276]
We show that if one has weak supervision from observations generated by sparse perturbations of the latent variables, identification is achievable under unknown continuous latent distributions.
We propose a natural estimation procedure based on this theory and illustrate it on low-dimensional synthetic and image-based experiments.
arXiv Detail & Related papers (2022-06-02T15:30:07Z)
- Evidential Sparsification of Multimodal Latent Spaces in Conditional Variational Autoencoders [63.46738617561255]
We consider the problem of sparsifying the discrete latent space of a trained conditional variational autoencoder.
We use evidential theory to identify the latent classes that receive direct evidence from a particular input condition and filter out those that do not.
Experiments on diverse tasks, such as image generation and human behavior prediction, demonstrate the effectiveness of our proposed technique.
arXiv Detail & Related papers (2020-10-19T01:27:21Z)
- Learning Disentangled Representations with Latent Variation Predictability [102.4163768995288]
This paper defines the variation predictability of latent disentangled representations.
Within an adversarial generation process, we encourage variation predictability by maximizing the mutual information between latent variations and corresponding image pairs.
We develop an evaluation metric that does not rely on the ground-truth generative factors to measure the disentanglement of latent representations.
arXiv Detail & Related papers (2020-07-25T08:54:26Z)
- Learning to Manipulate Individual Objects in an Image [71.55005356240761]
We describe a method to train a generative model with latent factors that are independent and localized.
This means that perturbing the latent variables affects only local regions of the synthesized image, corresponding to objects.
Unlike other unsupervised generative models, ours enables object-centric manipulation, without requiring object-level annotations.
arXiv Detail & Related papers (2020-04-11T21:50:20Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information provided and is not responsible for any consequences arising from its use.