An Image is Worth More Than a Thousand Words: Towards Disentanglement in
the Wild
- URL: http://arxiv.org/abs/2106.15610v1
- Date: Tue, 29 Jun 2021 17:54:24 GMT
- Title: An Image is Worth More Than a Thousand Words: Towards Disentanglement in the Wild
- Authors: Aviv Gabbay, Niv Cohen, Yedid Hoshen
- Abstract summary: Unsupervised disentanglement has been shown to be theoretically impossible without inductive biases on the models and the data.
We propose a method for disentangling a set of factors which are only partially labeled, as well as separating the complementary set of residual factors.
- Score: 34.505472771669744
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Unsupervised disentanglement has been shown to be theoretically impossible
without inductive biases on the models and the data. As an alternative
approach, recent methods rely on limited supervision to disentangle the factors
of variation and allow their identifiability. While annotating the true
generative factors is only required for a limited number of observations, we
argue that it is infeasible to enumerate all the factors of variation that
describe a real-world image distribution. To this end, we propose a method for
disentangling a set of factors which are only partially labeled, as well as
separating the complementary set of residual factors that are never explicitly
specified. Our success in this challenging setting, demonstrated on synthetic
benchmarks, gives rise to leveraging off-the-shelf image descriptors to
partially annotate a subset of attributes in real image domains (e.g. of human
faces) with minimal manual effort. Specifically, we use a recent language-image
embedding model (CLIP) to annotate a set of attributes of interest in a
zero-shot manner and demonstrate state-of-the-art disentangled image
manipulation results.
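The zero-shot annotation step mentioned in the abstract can be illustrated with a minimal sketch. It assumes image and text-prompt embeddings have already been computed by a language-image model such as CLIP; here they are replaced by toy 2-D vectors. The function names, the `margin` rule for leaving ambiguous images unlabeled, and the example prompts are illustrative assumptions, not the authors' procedure, which the abstract does not specify.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def zero_shot_label(image_emb, prompt_embs, margin=0.1):
    """Return the index of the best-matching prompt, or None if ambiguous.

    `margin` is a hypothetical confidence rule: an image is labeled only
    when its top similarity beats the runner-up by at least this much,
    which yields a partially labeled set.
    """
    sims = [cosine(image_emb, p) for p in prompt_embs]
    ranked = sorted(range(len(sims)), key=lambda i: sims[i], reverse=True)
    best, runner_up = ranked[0], ranked[1]
    if sims[best] - sims[runner_up] < margin:
        return None  # leave this image unlabeled for this attribute
    return best

# Toy stand-ins for text embeddings of two prompts for one attribute,
# e.g. "a person with blond hair" and "a person with black hair".
prompt_embs = [[1.0, 0.0], [0.0, 1.0]]

# Toy stand-ins for image embeddings.
image_embs = [[0.9, 0.1], [0.1, 0.8], [0.5, 0.5]]

labels = [zero_shot_label(e, prompt_embs) for e in image_embs]
print(labels)
```

Images whose similarities to the competing prompts are too close stay unlabeled, so only a subset of observations carries a label for the attribute, matching the partially labeled setting the abstract describes.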