Unsupervised Compositional Concepts Discovery with Text-to-Image
Generative Models
- URL: http://arxiv.org/abs/2306.05357v2
- Date: Thu, 3 Aug 2023 17:07:41 GMT
- Title: Unsupervised Compositional Concepts Discovery with Text-to-Image
Generative Models
- Authors: Nan Liu, Yilun Du, Shuang Li, Joshua B. Tenenbaum, Antonio Torralba
- Abstract summary: In this paper, we consider the inverse problem -- given a collection of different images, can we discover the generative concepts that represent each image?
We present an unsupervised approach to discover generative concepts from a collection of images, disentangling different art styles in paintings, objects, and lighting from kitchen scenes, and discovering image classes given ImageNet images.
- Score: 80.75258849913574
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Text-to-image generative models have enabled high-resolution image synthesis
across different domains, but require users to specify the content they wish to
generate. In this paper, we consider the inverse problem -- given a collection
of different images, can we discover the generative concepts that represent
each image? We present an unsupervised approach to discover generative concepts
from a collection of images, disentangling different art styles in paintings,
objects, and lighting from kitchen scenes, and discovering image classes given
ImageNet images. We show how such generative concepts can accurately represent
the content of images, be recombined and composed to generate new artistic and
hybrid images, and be further used as a representation for downstream
classification tasks.
Related papers
- Visual Concept-driven Image Generation with Text-to-Image Diffusion Model [65.96212844602866]
Text-to-image (TTI) models have demonstrated impressive results in generating high-resolution images of complex scenes.
Recent approaches have extended these methods with personalization techniques that allow them to integrate user-illustrated concepts.
However, the ability to generate images with multiple interacting concepts, such as human subjects, as well as concepts that may be entangled in one, or across multiple, image illustrations remains illusive.
We propose a concept-driven TTI personalization framework that addresses these core challenges.
arXiv Detail & Related papers (2024-02-18T07:28:37Z) - DiffMorph: Text-less Image Morphing with Diffusion Models [0.0]
verb|DiffMorph| synthesizes images that mix concepts without the use of textual prompts.
verb|DiffMorph| takes an initial image with conditioning artist-drawn sketches to generate a morphed image.
We employ a pre-trained text-to-image diffusion model and fine-tune it to reconstruct each image faithfully.
arXiv Detail & Related papers (2024-01-01T12:42:32Z) - Semantic Draw Engineering for Text-to-Image Creation [2.615648035076649]
We propose a method that utilizes artificial intelligence models for thematic creativity.
The method involves converting all visual elements into quantifiable data structures before creating images.
We evaluate the effectiveness of this approach in terms of semantic accuracy, image efficiency, and computational efficiency.
arXiv Detail & Related papers (2023-12-23T05:35:15Z) - Decoupled Textual Embeddings for Customized Image Generation [62.98933630971543]
Customized text-to-image generation aims to learn user-specified concepts with a few images.
Existing methods usually suffer from overfitting issues and entangle the subject-unrelated information with the learned concept.
We propose the DETEX, a novel approach that learns the disentangled concept embedding for flexible customized text-to-image generation.
arXiv Detail & Related papers (2023-12-19T03:32:10Z) - Taming Encoder for Zero Fine-tuning Image Customization with
Text-to-Image Diffusion Models [55.04969603431266]
This paper proposes a method for generating images of customized objects specified by users.
The method is based on a general framework that bypasses the lengthy optimization required by previous approaches.
We demonstrate through experiments that our proposed method is able to synthesize images with compelling output quality, appearance diversity, and object fidelity.
arXiv Detail & Related papers (2023-04-05T17:59:32Z) - Ablating Concepts in Text-to-Image Diffusion Models [57.9371041022838]
Large-scale text-to-image diffusion models can generate high-fidelity images with powerful compositional ability.
These models are typically trained on an enormous amount of Internet data, often containing copyrighted material, licensed images, and personal photos.
We propose an efficient method of ablating concepts in the pretrained model, preventing the generation of a target concept.
arXiv Detail & Related papers (2023-03-23T17:59:42Z) - Plug-and-Play Diffusion Features for Text-Driven Image-to-Image
Translation [10.39028769374367]
We present a new framework that takes text-to-image synthesis to the realm of image-to-image translation.
Our method harnesses the power of a pre-trained text-to-image diffusion model to generate a new image that complies with the target text.
arXiv Detail & Related papers (2022-11-22T20:39:18Z) - Unsupervised Layered Image Decomposition into Object Prototypes [39.20333694585477]
We present an unsupervised learning framework for decomposing images into layers of automatically discovered object models.
We first validate our approach by providing results on par with the state of the art on standard multi-object synthetic benchmarks.
We then demonstrate the applicability of our model to real images in tasks that include clustering (SVHN, GTSRB), cosegmentation (Weizmann Horse) and object discovery from unfiltered social network images.
arXiv Detail & Related papers (2021-04-29T18:02:01Z) - SketchEmbedNet: Learning Novel Concepts by Imitating Drawings [125.45799722437478]
We explore properties of image representations learned by training a model to produce sketches of images.
We show that this generative, class-agnostic model produces informative embeddings of images from novel examples, classes, and even novel datasets in a few-shot setting.
arXiv Detail & Related papers (2020-08-27T16:43:28Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.