PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions
- URL: http://arxiv.org/abs/2308.05184v1
- Date: Wed, 9 Aug 2023 18:41:11 GMT
- Title: PromptPaint: Steering Text-to-Image Generation Through Paint Medium-like Interactions
- Authors: John Joon Young Chung, Eytan Adar
- Abstract summary: PromptPaint allows users to mix prompts that express challenging concepts.
We characterize different approaches for mixing prompts, design trade-offs, and socio-technical challenges for generative models.
- Score: 12.792576041526287
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: While diffusion-based text-to-image (T2I) models provide a simple and
powerful way to generate images, guiding this generation remains a challenge.
For concepts that are difficult to describe through language, users may
struggle to create prompts. Moreover, many of these models are built as
end-to-end systems, lacking support for iterative shaping of the image. In
response, we introduce PromptPaint, which combines T2I generation with
interactions that model how we use colored paints. PromptPaint allows users to
go beyond language to mix prompts that express challenging concepts. Just as we
iteratively tune colors through layered placements of paint on a physical
canvas, PromptPaint allows users to apply different prompts to different
canvas areas and at different stages of the generative process. Through a set of
studies, we characterize different approaches for mixing prompts, design
trade-offs, and socio-technical challenges for generative models. With
PromptPaint we provide insight into future steerable generative tools.
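The paper's central mechanism, mixing prompts the way paints are mixed, can be approximated outside PromptPaint by blending text embeddings before they condition the denoiser. Below is a minimal sketch with Hugging Face diffusers; the model ID, prompts, and 0.6/0.4 mixing weights are illustrative assumptions rather than the authors' implementation, and the region- and time-scheduling half of the system is omitted.

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def embed(prompt: str) -> torch.Tensor:
    """Encode a prompt into the CLIP text-embedding space the UNet conditions on."""
    tokens = pipe.tokenizer(
        prompt,
        padding="max_length",
        max_length=pipe.tokenizer.model_max_length,
        truncation=True,
        return_tensors="pt",
    ).input_ids.to(pipe.device)
    return pipe.text_encoder(tokens)[0]

# "Mix" two hard-to-verbalize concepts like paints: a weighted blend in
# embedding space, with the weights playing the role of paint proportions.
a = embed("a misty watercolor forest")
b = embed("a neon cyberpunk alley")
mixed = 0.6 * a + 0.4 * b

image = pipe(prompt_embeds=mixed, num_inference_steps=30).images[0]
image.save("mixed_prompts.png")
```

Because the blend happens in embedding space, nudging the weights moves the output gradually between the two concepts, much like adjusting paint proportions.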
Related papers
- I Dream My Painting: Connecting MLLMs and Diffusion Models via Prompt Generation for Text-Guided Multi-Mask Inpainting [8.94249680213101]
Inpainting fills missing or corrupted regions of an image so that they blend seamlessly with the surrounding content and style.
We introduce the novel task of multi-mask inpainting, where multiple regions are simultaneously inpainted using distinct prompts.
Our pipeline delivers creative and accurate inpainting results.
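A rough way to reproduce the multi-mask setting with off-the-shelf tools is to run a standard inpainting pipeline once per (mask, prompt) pair, feeding each result back in as the next input. The sketch below assumes hypothetical mask files and prompts, and omits the paper's MLLM, which generates the per-region prompts automatically.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("scene.png").convert("RGB").resize((512, 512))

# Hypothetical per-region masks (white = repaint) paired with distinct prompts.
regions = [
    ("mask_sky.png", "a dramatic sunset sky"),
    ("mask_table.png", "a vase of tulips on a wooden table"),
]

for mask_path, prompt in regions:
    mask = Image.open(mask_path).convert("L").resize((512, 512))
    # Each pass inpaints one region; the running image accumulates the edits.
    image = pipe(prompt=prompt, image=image, mask_image=mask).images[0]

image.save("multi_mask_result.png")
```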
arXiv Detail & Related papers (2024-11-28T10:55:09Z)
- Automated Black-box Prompt Engineering for Personalized Text-to-Image Generation [149.96612254604986]
PRISM is an algorithm that automatically identifies human-interpretable and transferable prompts.
It can effectively generate desired concepts given only black-box access to T2I models.
Our experiments demonstrate the versatility and effectiveness of PRISM in generating accurate prompts for objects, styles and images.
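The black-box outer loop PRISM relies on can be sketched generically: generate with candidate prompts, score the results against a reference, keep the best. In the sketch below, a fixed candidate list and a CLIP image-similarity score stand in for PRISM's LLM-driven proposal and refinement steps; file names and prompts are hypothetical.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor
from diffusers import StableDiffusionPipeline

t2i = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

@torch.no_grad()
def similarity(img, ref) -> float:
    """CLIP image-image similarity: how close a generation is to the target look."""
    feats = clip.get_image_features(**proc(images=[img, ref], return_tensors="pt"))
    feats = feats / feats.norm(dim=-1, keepdim=True)
    return (feats[0] @ feats[1]).item()

# Black-box outer loop: generate with each candidate prompt, keep the best scorer.
reference = Image.open("target_style.png").convert("RGB")
candidates = [
    "a corgi in watercolor style",
    "a corgi, flat pastel illustration",
    "a corgi, detailed oil painting",
]
best = max(
    candidates,
    key=lambda p: similarity(t2i(p, num_inference_steps=20).images[0], reference),
)
print("best prompt:", best)
```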
arXiv Detail & Related papers (2024-03-28T02:35:53Z)
- Towards Language-Driven Video Inpainting via Multimodal Large Language Models [116.22805434658567]
We introduce a new task, language-driven video inpainting, which uses natural language instructions to guide the inpainting process.
We present the Remove Objects from Videos by Instructions dataset.
arXiv Detail & Related papers (2024-01-18T18:59:13Z)
- HD-Painter: High-Resolution and Prompt-Faithful Text-Guided Image Inpainting with Diffusion Models [59.01600111737628]
HD-Painter is a training-free approach that accurately follows prompts and coherently scales to high-resolution image inpainting.
To this end, we design the Prompt-Aware Introverted Attention (PAIntA) layer, which enhances self-attention scores.
Our experiments demonstrate that HD-Painter surpasses existing state-of-the-art approaches quantitatively and qualitatively.
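At its core, PAIntA rescales self-attention so that known-region features aligned with the prompt contribute more. The toy sketch below shows per-key score rescaling in isolation; in HD-Painter the layer sits inside the diffusion UNet and derives the alignment weights from cross-attention with the prompt, so the shapes and the log-bias form here are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def rescaled_self_attention(q, k, v, alignment):
    """Toy self-attention with per-key rescaling.

    q, k, v:   (tokens, dim) projections of spatial features.
    alignment: (tokens,) weights in (0, 1], e.g. how strongly each known-region
               token cross-attends to the prompt.
    """
    scores = q @ k.T / q.shape[-1] ** 0.5           # standard scaled dot-product
    scores = scores + torch.log(alignment + 1e-6)   # amplify prompt-aligned keys
    return F.softmax(scores, dim=-1) @ v

# Smoke test on random features.
t, d = 16, 64
q, k, v = (torch.randn(t, d) for _ in range(3))
out = rescaled_self_attention(q, k, v, torch.rand(t))
print(out.shape)  # torch.Size([16, 64])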
arXiv Detail & Related papers (2023-12-21T18:09:30Z)
- A Task is Worth One Word: Learning with Task Prompts for High-Quality Versatile Image Inpainting [38.53807472111521]
We introduce PowerPaint, the first high-quality and versatile inpainting model that excels in multiple inpainting tasks.
We demonstrate the versatility of the task prompt in PowerPaint by showcasing its effectiveness as a negative prompt for object removal.
We leverage prompt techniques to enable controllable shape-guided object inpainting, enhancing the model's applicability in shape-guided applications.
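The object-removal idea maps onto the standard `negative_prompt` argument of diffusers inpainting pipelines: push object-like content into the negative prompt so the model fills the hole with background. PowerPaint's actual task prompts are learned tokens, so the plain-text strings and file names below are stand-ins.

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

image = Image.open("street.png").convert("RGB").resize((512, 512))
mask = Image.open("car_mask.png").convert("L").resize((512, 512))  # white = remove

# Steer the fill toward background and away from re-generating an object
# by pushing object-like content into the negative prompt.
result = pipe(
    prompt="empty street, clean background",
    negative_prompt="car, vehicle, object",  # stand-in for a learned task prompt
    image=image,
    mask_image=mask,
).images[0]
result.save("removed.png")
```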
arXiv Detail & Related papers (2023-12-06T16:34:46Z)
- Uni-paint: A Unified Framework for Multimodal Image Inpainting with Pretrained Diffusion Model [19.800236358666123]
We propose Uni-paint, a unified framework for multimodal inpainting.
Uni-paint offers various modes of guidance, including text-driven, stroke-driven, and exemplar-driven inpainting.
Our approach achieves comparable results to existing single-modal methods.
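A skeletal sketch of what such a unified front end can look like: one entry point that dispatches on whichever modality the user supplies. The per-mode functions are placeholders, since Uni-paint's point is that a single pretrained diffusion backbone serves all of them; nothing here is the paper's actual API.

```python
from dataclasses import dataclass
from typing import Optional
from PIL import Image

# Placeholders for the shared backbone's conditioning paths; in Uni-paint a
# single pretrained diffusion model serves every mode.
def _text_guided(image, mask, text):
    return image  # stub

def _stroke_guided(image, mask, strokes):
    return image  # stub

def _exemplar_guided(image, mask, exemplar):
    return image  # stub

@dataclass
class Guidance:
    """One multimodal condition; set exactly one field."""
    text: Optional[str] = None
    strokes: Optional[Image.Image] = None   # coarse scribbles inside the hole
    exemplar: Optional[Image.Image] = None  # reference image to imitate

def inpaint(image: Image.Image, mask: Image.Image, g: Guidance) -> Image.Image:
    """Unified entry point: dispatch on whichever modality the user supplied."""
    if g.text is not None:
        return _text_guided(image, mask, g.text)
    if g.strokes is not None:
        return _stroke_guided(image, mask, g.strokes)
    if g.exemplar is not None:
        return _exemplar_guided(image, mask, g.exemplar)
    raise ValueError("no guidance provided")
```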
arXiv Detail & Related papers (2023-10-11T06:11:42Z)
- PaintSeg: Training-free Segmentation via Painting [50.17936803209125]
PaintSeg is a new unsupervised method for segmenting objects without any training.
Inpainting and outpainting are alternated, with the former masking the foreground and filling in the background, and the latter masking the background while recovering the missing part of the foreground object.
Our experimental results demonstrate that PaintSeg outperforms existing approaches in coarse mask-prompt, box-prompt, and point-prompt segmentation tasks.
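The alternation is easy to state as a loop: repaint the image with the foreground masked (the model fills in pure background), compare against the original to update the mask, then repaint with the background masked to recover the object, and repeat. The sketch below is a pseudocode-level approximation; the `repaint` callable, the pixel-difference test, and the 0.1 threshold are hypothetical stand-ins for PaintSeg's actual operators.

```python
import numpy as np

def paintseg_like(image, mask, repaint, n_rounds=5, thresh=0.1):
    """Alternating masked-repaint segmentation, in the spirit of PaintSeg.

    image:   (H, W, 3) float array in [0, 1].
    mask:    (H, W) boolean initial guess (True = foreground).
    repaint: callable(image, region_mask) -> image with region_mask re-synthesized
             by a generative inpainter (hypothetical plug-in, e.g. a diffusion model).
    """
    for _ in range(n_rounds):
        # Inpainting step: mask the foreground so the model fills in background.
        bg_only = repaint(image, mask)
        # Pixels that changed a lot when "backgrounded" likely belong to the object.
        mask = np.abs(image - bg_only).mean(axis=-1) > thresh
        # Outpainting step: mask the background so the model recovers the object.
        fg_only = repaint(image, ~mask)
        # Contrast the two reconstructions to refine the mask.
        mask = np.abs(fg_only - bg_only).mean(axis=-1) > thresh
    return mask
```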
arXiv Detail & Related papers (2023-05-30T20:43:42Z)
- AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation [61.77946020543875]
We propose a framework for translating raw descriptions with complex semantics into semantically corresponding images.
Our framework consists of two components: a prompt-based projection module from text embeddings to image embeddings, and an adapted image generation module built on StyleGAN.
Benefiting from the pre-trained models, our method can handle complex descriptions and does not require external paired data for training.
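The first component, a mapping from CLIP text embeddings into the image-embedding space that then drives the StyleGAN, can be sketched as a small residual projection head. The dimensions and architecture below are illustrative assumptions, not the paper's.

```python
import torch
import torch.nn as nn

class TextToImageEmbedding(nn.Module):
    """Illustrative projection head: CLIP text embedding -> image-embedding space.

    The paper conditions this mapping on prompts and feeds the projected
    embedding to an adapted StyleGAN; here we only show the mapping itself.
    """
    def __init__(self, dim: int = 512, hidden: int = 1024):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, hidden), nn.GELU(),
            nn.Linear(hidden, hidden), nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, text_emb: torch.Tensor) -> torch.Tensor:
        # Residual mapping keeps the output close to the shared CLIP space.
        out = text_emb + self.net(text_emb)
        return out / out.norm(dim=-1, keepdim=True)

proj = TextToImageEmbedding()
fake_text_emb = torch.randn(2, 512)  # stand-in for CLIP text features
img_emb = proj(fake_text_emb)        # would condition the StyleGAN generator
print(img_emb.shape)                 # torch.Size([2, 512])
```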
arXiv Detail & Related papers (2022-09-07T13:53:54Z)
- In&Out: Diverse Image Outpainting via GAN Inversion [89.84841983778672]
Image outpainting seeks a semantically consistent extension of the input image beyond its available content.
In this work, we formulate the problem from the perspective of inverting generative adversarial networks.
Our generator renders micro-patches conditioned on their joint latent code as well as their individual positions in the image.
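The micro-patch idea can be sketched as a generator that decodes each patch from a shared latent code plus its own position embedding, so coordinates outside the original frame still yield consistent content. The MLP below is a toy stand-in for the paper's convolutional GAN generator; all sizes are illustrative.

```python
import torch
import torch.nn as nn

class MicroPatchGenerator(nn.Module):
    """Toy generator in the In&Out spirit: each micro-patch is decoded from a
    shared (joint) latent code plus its own position embedding."""
    def __init__(self, latent_dim: int = 128, patch: int = 8):
        super().__init__()
        self.patch = patch
        self.pos_embed = nn.Linear(2, latent_dim)  # (x, y) -> latent space
        self.decode = nn.Sequential(
            nn.Linear(latent_dim, 256), nn.ReLU(),
            nn.Linear(256, patch * patch * 3), nn.Tanh(),
        )

    def forward(self, z: torch.Tensor, xy: torch.Tensor) -> torch.Tensor:
        # z: (N, latent_dim) joint code, repeated per patch; xy: (N, 2) centers.
        h = z + self.pos_embed(xy)
        return self.decode(h).view(-1, 3, self.patch, self.patch)

gen = MicroPatchGenerator()
z = torch.randn(1, 128).expand(4, -1)  # one joint latent code for all patches
xy = torch.tensor([[-1.0, 0.0], [0.0, 0.0], [1.0, 0.0], [2.0, 0.0]])  # last is outside the frame
print(gen(z, xy).shape)  # torch.Size([4, 3, 8, 8])
```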
arXiv Detail & Related papers (2021-04-01T17:59:10Z)
This list is automatically generated from the titles and abstracts of the papers on this site.