Related papers: Expanding the Generative AI Design Space through Structured Prompting and Multimodal Interfaces

URL: http://arxiv.org/abs/2504.14320v2
Date: Tue, 22 Apr 2025 17:59:41 GMT
Title: Expanding the Generative AI Design Space through Structured Prompting and Multimodal Interfaces
Authors: Nimisha Karnatak, Adrien Baranes, Rob Marchant, Huinan Zeng, Tríona Butler, Kristen Olson,
Abstract summary: ACAI (AI Co-Creation for Advertising and Inspiration) is a multimodal generative AI tool designed to support novice designers by moving beyond traditional prompt interfaces. This work contributes to HCI research on generative systems by showing how structured interfaces can foreground user-defined context, improve alignment, and enhance co-creative control in novice creative workflows.
Score: 1.051328497890725
License: http://creativecommons.org/licenses/by/4.0/
Abstract: Text-based prompting remains the predominant interaction paradigm in generative AI, yet it often introduces friction for novice users such as small business owners (SBOs), who struggle to articulate creative goals in domain-specific contexts like advertising. Through a formative study with six SBOs in the United Kingdom, we identify three key challenges: difficulties in expressing brand intuition through prompts, limited opportunities for fine-grained adjustment and refinement during and after content generation, and the frequent production of generic content that lacks brand specificity. In response, we present ACAI (AI Co-Creation for Advertising and Inspiration), a multimodal generative AI tool designed to support novice designers by moving beyond traditional prompt interfaces. ACAI features a structured input system composed of three panels: Branding, Audience and Goals, and the Inspiration Board. These inputs allow users to convey brand-relevant context and visual preferences. This work contributes to HCI research on generative systems by showing how structured interfaces can foreground user-defined context, improve alignment, and enhance co-creative control in novice creative workflows.

Related papers

PromptCanvas: Composable Prompting Workspaces Using Dynamic Widgets for Exploration and Iteration in Creative Writing [25.41215417987532]
We introduce PromptCanvas, a concept that transforms prompting into a composable, widget-based experience on an infinite canvas. Users can generate, customize, and arrange interactive widgets representing various facets of their text, offering greater control over AI-generated content.
arXiv Detail & Related papers (2025-06-04T09:13:51Z)
POET: Supporting Prompting Creativity and Personalization with Automated Expansion of Text-to-Image Generation [31.886910258606875]
State-of-the-art visual generative AI tools hold immense potential to assist users in the early ideation stages of creative tasks. Many large-scale text-to-image systems are designed for broad applicability, yielding conventional output that may limit creative exploration. We introduce POET, a real-time interactive tool that automatically discovers dimensions of homogeneity in text-to-image generative models.
arXiv Detail & Related papers (2025-04-18T00:54:36Z)
Piece it Together: Part-Based Concepting with IP-Priors [52.01640707131325]
We introduce a generative framework that seamlessly integrates a partial set of user-provided visual components into a coherent composition. Our approach builds on a strong and underexplored representation space, extracted from IP-Adapter+. We also present a LoRA-based fine-tuning strategy that significantly improves prompt adherence in IP-Adapter+ for a given task.
arXiv Detail & Related papers (2025-03-13T13:46:10Z)
ACAI for SBOs: AI Co-creation for Advertising and Inspiration for Small Business Owners [1.114004309769802]
Small business owners (SBOs) often lack the resources and design experience needed to produce high-quality advertisements. We developed ACAI (AI Co-Creation for Advertising and Inspiration), a GenAI-powered multimodal advertisement creation tool. We conducted a user study with 16 SBOs in London to explore their perceptions of and interactions with ACAI in advertisement creation.
arXiv Detail & Related papers (2025-03-09T19:00:36Z)
GLDesigner: Leveraging Multi-Modal LLMs as Designer for Enhanced Aesthetic Text Glyph Layouts [53.568057283934714]
We propose a Vision-Language Model (VLM)-based framework that generates content-aware text logo layouts. We introduce two model techniques that reduce the computational cost for processing multiple glyph images simultaneously. To support instruction tuning of our model, we construct two extensive text logo datasets that are five times larger than existing public datasets.
arXiv Detail & Related papers (2024-11-18T10:04:10Z)
Survey of User Interface Design and Interaction Techniques in Generative AI Applications [79.55963742878684]
We aim to create a compendium of different user-interaction patterns that can be used as a reference for designers and developers alike. We also strive to lower the entry barrier for those attempting to learn more about the design of generative AI applications.
arXiv Detail & Related papers (2024-10-28T23:10:06Z)
A Novel Idea Generation Tool using a Structured Conversational AI (CAI) System [0.0]
This paper presents a novel conversational AI-enabled active ideation interface as a creative idea-generation tool to assist novice designers. It is a dynamic, interactive, and contextually responsive approach, actively involving a large language model (LLM) from the domain of natural language processing (NLP) in artificial intelligence (AI). Integrating such AI models with ideation creates what we refer to as an Active Ideation scenario, which helps foster continuous dialogue-based interaction, context-sensitive conversation, and prolific idea generation.
arXiv Detail & Related papers (2024-09-09T16:02:27Z)
Constraining Participation: Affordances of Feedback Features in Interfaces to Large Language Models [49.74265453289855]
Large language models (LLMs) are now accessible to anyone with a computer, a web browser, and an internet connection via browser-based interfaces. This paper examines the affordances of interactive feedback features in ChatGPT's interface, analysing how they shape user input and participation in iteration.
arXiv Detail & Related papers (2024-08-27T13:50:37Z)
Empowering Visual Creativity: A Vision-Language Assistant to Image Editing Recommendations [109.65267337037842]
We introduce the task of Image Editing Recommendation (IER). IER aims to automatically generate diverse creative editing instructions from an input image and a simple prompt representing the users' under-specified editing purpose. We introduce Creativity-Vision Language Assistant (Creativity-VLA), a multimodal framework designed specifically for edit-instruction generation.
arXiv Detail & Related papers (2024-05-31T18:22:29Z)
How Human-Centered Explainable AI Interface Are Designed and Evaluated: A Systematic Survey [48.97104365617498]
The emerging area of Explainable Interfaces (EIs) focuses on the user interface and user experience design aspects of XAI. This paper presents a systematic survey of 53 publications to identify current trends in human-XAI interaction and promising directions for EI design and development.
arXiv Detail & Related papers (2024-03-21T15:44:56Z)
Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled. First, we quantize and embed both text and visual prompt into a unified representational space. Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)
The role of interface design on prompt-mediated creativity in Generative AI [0.0]
We analyze more than 145,000 prompts from two Generative AI platforms. We find that users exhibit a tendency towards exploration of new topics over exploitation of concepts visited previously.
arXiv Detail & Related papers (2023-11-30T22:33:34Z)
How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models [29.420160518026496]
We discuss the opportunities and challenges for interactive creative applications that use prompting as a new paradigm for Human-AI interaction. Based on our analysis, we propose four design goals for user interfaces that support prompting. We illustrate these with concrete UI design sketches, focusing on the use case of creative writing.
arXiv Detail & Related papers (2022-09-03T10:16:34Z)
Multimodal Dialog Systems with Dual Knowledge-enhanced Generative Pretrained Language Model [63.461030694700014]
We propose a novel dual knowledge-enhanced generative pretrained language model for multimodal task-oriented dialog systems (DKMD). The proposed DKMD consists of three key components: dual knowledge selection, dual knowledge-enhanced context learning, and knowledge-enhanced response generation. Experiments on a public dataset verify the superiority of the proposed DKMD over state-of-the-art competitors.
arXiv Detail & Related papers (2022-07-16T13:02:54Z)

This list is automatically generated from the titles and abstracts of the papers on this site.