PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative
Dialogue with LLM
- URL: http://arxiv.org/abs/2307.08985v1
- Date: Tue, 18 Jul 2023 05:51:00 GMT
- Title: PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative
Dialogue with LLM
- Authors: Seungho Baek, Hyerin Im, Jiseung Ryu, Juhyeong Park, Takyeon Lee
- Abstract summary: We present PromptCrafter, a novel mixed-initiative system that allows step-by-step crafting of text-to-image prompts.
Through this iterative process, users can efficiently explore the model's capabilities and clarify their intent.
PromptCrafter also lets users refine prompts by answering clarifying questions generated by a Large Language Model.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text-to-image generation models can generate images across a diverse
range of subjects and styles from a single prompt. Recent works have
proposed a variety of interaction methods that help users understand the
capabilities of these models and make use of them. However, how to help users
efficiently explore a model's capabilities and create effective prompts remains
an open research question. In this paper, we present PromptCrafter, a
novel mixed-initiative system that allows step-by-step crafting of
text-to-image prompts. Through this iterative process, users can efficiently
explore the model's capabilities and clarify their intent. PromptCrafter also
lets users refine prompts by answering clarifying questions
generated by a Large Language Model. Lastly, users can revert to a
desired step by reviewing the work history. In this workshop paper, we discuss
the design process of PromptCrafter and our plans for follow-up studies.
Related papers
- Exploring Prompt Engineering Practices in the Enterprise [3.7882262667445734]
A prompt is a natural language instruction designed to elicit certain behaviour or output from a model.
For complex tasks and tasks with specific requirements, prompt design is not trivial.
We analyze sessions of prompt editing behavior, categorizing the parts of prompts users iterated on and the types of changes they made.
arXiv Detail & Related papers (2024-03-13T20:32:32Z)
- Prompt Expansion for Adaptive Text-to-Image Generation [51.67811570987088]
This paper proposes a Prompt Expansion framework that helps users generate high-quality, diverse images with less effort.
The Prompt Expansion model takes a text query as input and outputs a set of expanded text prompts.
We conduct a human evaluation study that shows that images generated through Prompt Expansion are more aesthetically pleasing and diverse than those generated by baseline methods.
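The interface of such a framework (one short query in, several richer candidate prompts out) can be sketched as below. The actual Prompt Expansion system uses a trained model; this template-based expander and its modifier lists are purely hypothetical stand-ins to show the input/output contract.

```python
# Illustrative sketch of the Prompt Expansion contract: one short query
# in, several richer candidate prompts out.  The real system is a
# trained model; this template-based expander is only a stand-in.

STYLE_MODIFIERS = ["oil painting", "35mm photograph", "isometric 3D render"]
DETAIL_MODIFIERS = ["golden hour lighting", "highly detailed", "wide-angle view"]

def expand_prompt(query: str, n: int = 3) -> list[str]:
    """Return up to n expanded prompts built from the base query."""
    expansions = [f"{query}, {style}, {detail}"
                  for style, detail in zip(STYLE_MODIFIERS, DETAIL_MODIFIERS)]
    return expansions[:n]

candidates = expand_prompt("a lighthouse at dusk")
```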
arXiv Detail & Related papers (2023-12-27T21:12:21Z)
- Customization Assistant for Text-to-image Generation [40.76198867803018]
We propose a new framework consisting of a new model design and a novel training strategy.
The resulting assistant can perform customized generation in 2-5 seconds without any test time fine-tuning.
arXiv Detail & Related papers (2023-12-05T16:54:42Z)
- PromptMagician: Interactive Prompt Engineering for Text-to-Image Creation [16.41459454076984]
This research proposes PromptMagician, a visual analysis system that helps users explore the image results and refine the input prompts.
The backbone of our system is a prompt recommendation model that takes user prompts as input, retrieves similar prompt-image pairs from DiffusionDB, and identifies special (important and relevant) prompt keywords.
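The retrieval step behind such a recommendation model can be sketched as follows. A real system would use learned embeddings over DiffusionDB; here a token-overlap cosine similarity and a tiny in-memory prompt list are hypothetical stand-ins.

```python
# Hypothetical sketch of prompt recommendation by retrieval: rank stored
# prompts by similarity to the user's query.  A real system would embed
# prompts from DiffusionDB; bag-of-words cosine similarity stands in here.

from collections import Counter
from math import sqrt

PROMPT_DB = [
    "a castle on a hill, fantasy art, trending",
    "portrait of a knight, dramatic lighting, fantasy art",
    "city street at night, neon, cyberpunk",
]

def similarity(a: str, b: str) -> float:
    """Cosine similarity over simple bag-of-words counts."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[t] * cb[t] for t in ca)
    norm = (sqrt(sum(v * v for v in ca.values()))
            * sqrt(sum(v * v for v in cb.values())))
    return dot / norm if norm else 0.0

def recommend(query: str, k: int = 2) -> list[str]:
    """Return the k database prompts most similar to the query."""
    return sorted(PROMPT_DB, key=lambda p: similarity(query, p), reverse=True)[:k]

matches = recommend("a fantasy castle")
```

The retrieved prompts' keywords can then be surfaced to the user as refinement suggestions, which is the role the recommendation model plays in the described system.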
arXiv Detail & Related papers (2023-07-18T07:46:25Z)
- SSP: Self-Supervised Post-training for Conversational Search [63.28684982954115]
We propose SSP, a new post-training paradigm with three self-supervised tasks that efficiently initializes the conversational search model.
To verify the effectiveness of our proposed method, we apply the conversational encoder post-trained with SSP to the conversational search task on two benchmark datasets: CAsT-19 and CAsT-20.
arXiv Detail & Related papers (2023-07-02T13:36:36Z)
- Frugal Prompting for Dialog Models [17.048111072193933]
This study examines different approaches for building dialog systems using large language models (LLMs).
As part of prompt tuning, we experiment with various ways of providing instructions, exemplars, current query and additional context.
The research also analyzes the representations of dialog history that have the optimal usable-information density.
arXiv Detail & Related papers (2023-05-24T09:06:49Z)
- Promptify: Text-to-Image Generation through Interactive Prompt Exploration with Large Language Models [29.057923932305123]
We present Promptify, an interactive system that supports prompt exploration and refinement for text-to-image generative models.
Our user study shows that Promptify effectively facilitates the text-to-image workflow and outperforms an existing baseline tool widely used for text-to-image generation.
arXiv Detail & Related papers (2023-04-18T22:59:11Z)
- TEMPERA: Test-Time Prompting via Reinforcement Learning [57.48657629588436]
We propose Test-time Prompt Editing using Reinforcement Learning (TEMPERA).
In contrast to prior prompt generation methods, TEMPERA can efficiently leverage prior knowledge.
Our method achieves a 5.33x average improvement in sample efficiency compared to traditional fine-tuning methods.
arXiv Detail & Related papers (2022-11-21T22:38:20Z)
- Interactive and Visual Prompt Engineering for Ad-hoc Task Adaptation with Large Language Models [116.25562358482962]
State-of-the-art neural language models can be used to solve ad-hoc language tasks without the need for supervised training.
PromptIDE allows users to experiment with prompt variations, visualize prompt performance, and iteratively optimize prompts.
arXiv Detail & Related papers (2022-08-16T17:17:53Z)
- Manual-Guided Dialogue for Flexible Conversational Agents [84.46598430403886]
How to build and use dialogue data efficiently, and how to deploy models across different domains at scale, are critical issues in building a task-oriented dialogue system.
We propose a novel manual-guided dialogue scheme, where the agent learns the tasks from both dialogue and manuals.
Our proposed scheme reduces the dependence of dialogue models on fine-grained domain ontology, and makes them more flexible to adapt to various domains.
arXiv Detail & Related papers (2022-08-16T08:21:12Z)
- Towards Large-Scale Interpretable Knowledge Graph Reasoning for Dialogue Systems [109.16553492049441]
We propose a novel method to incorporate the knowledge reasoning capability into dialogue systems in a more scalable and generalizable manner.
To the best of our knowledge, this is the first work to have transformer models generate responses by reasoning over differentiable knowledge graphs.
arXiv Detail & Related papers (2022-03-20T17:51:49Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences arising from its use.