Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
- URL: http://arxiv.org/abs/2310.12953v3
- Date: Wed, 13 Mar 2024 19:50:00 GMT
- Title: Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
- Authors: Sangho Suh, Meng Chen, Bryan Min, Toby Jia-Jun Li, Haijun Xia
- Abstract summary: We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas.
We propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses.
- Score: 19.62178304006683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Thanks to their generative capabilities, large language models (LLMs) have become an invaluable tool for creative processes. These models have the capacity to produce hundreds and thousands of visual and textual outputs, offering abundant inspiration for creative endeavors. But are we harnessing their full potential? We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas, rather than empowering them to explore the vast latent design space in generative models. To address this limitation, we propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses. We demonstrate the feasibility and usefulness of this framework through the design and development of an interactive system, Luminate, and a user study with 14 professional writers. Our work advances how we interact with LLMs for creative tasks, introducing a way to harness the creative potential of LLMs.
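To make the structured design-space idea concrete, here is a minimal sketch: an LLM is first asked to propose key dimensions of the design space, then to generate one response per combination of dimension values, so users see the breadth of the space rather than a single answer. This is an illustrative approximation, not Luminate's actual implementation; the prompts, model name, and helper functions are hypothetical stand-ins for any OpenAI-style chat API.

```python
# Minimal sketch of structured design-space generation (not Luminate's code).
# Assumes an OpenAI-style chat API and that OPENAI_API_KEY is set.
import itertools
import json
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    """Single-turn helper around a chat-completion API."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def generate_design_space(task: str, n_dims: int = 3) -> dict[str, list[str]]:
    """Ask the LLM to propose design dimensions with a few values each.
    Assumes the model returns bare JSON; real code would validate and retry."""
    raw = chat(
        f"For the creative task '{task}', propose {n_dims} design dimensions, "
        'each with 2-3 possible values. Reply with JSON only, shaped like '
        '{"dimension name": ["value1", "value2"]}.'
    )
    return json.loads(raw)

def explore(task: str) -> list[tuple[tuple[str, ...], str]]:
    """Generate one response per combination of dimension values,
    surfacing the design space instead of converging on one idea."""
    space = generate_design_space(task)
    results = []
    for combo in itertools.product(*space.values()):
        spec = ", ".join(f"{d}={v}" for d, v in zip(space, combo))
        results.append((combo, chat(f"{task} Constraints: {spec}")))
    return results

if __name__ == "__main__":
    for combo, text in explore("Write a tagline for a hiking app."):
        print(" | ".join(combo), "->", text[:80])
```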
Related papers
- A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts [2.709166684084394]
This study proposes a framework called GPS, which employs goals, prompts, and strategies to guide designers to systematically work with an LLM tool for improving the creativity of ideas generated during brainstorming.
Our framework, tested through a design example and a case study, demonstrates its effectiveness in stimulating creativity and in seamlessly integrating the LLM tool into design practices.
arXiv Detail & Related papers (2024-10-10T13:39:27Z)
- MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement.
A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively.
Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z)
- LLM2FEA: Discover Novel Designs with Generative Evolutionary Multitasking [21.237950330178354]
We propose the first attempt to discover novel designs in generative models by transferring knowledge across multiple domains.
By utilizing a multi-factorial evolutionary algorithm (MFEA) to drive a large language model, LLM2FEA integrates knowledge from various fields to generate prompts that guide the generative model in discovering novel and practical objects (see the toy sketch of an evolutionary prompt loop after this list).
arXiv Detail & Related papers (2024-06-21T07:20:51Z)
- Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback (RLHF) on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z)
- Divergent Creativity in Humans and Large Language Models [37.67363469600804]
The recent surge in the capabilities of Large Language Models has led to claims that they are approaching a level of creativity akin to human capabilities.
We leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans.
arXiv Detail & Related papers (2024-05-13T22:37:52Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- I-Design: Personalized LLM Interior Designer [57.00412237555167]
I-Design is a personalized interior designer that allows users to generate and visualize their design goals through natural language communication.
I-Design starts with a team of large language model agents that engage in dialogues and logical reasoning with one another.
The final design is then constructed in 3D by retrieving and integrating assets from an existing object database.
arXiv Detail & Related papers (2024-04-03T16:17:53Z)
- Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception [63.03288425612792]
We propose AnyRef, a general MLLM that can generate pixel-wise object perceptions and natural language descriptions from multi-modality references.
Our model achieves state-of-the-art results across multiple benchmarks, including diverse modality referring segmentation and region-level referring expression generation.
arXiv Detail & Related papers (2024-03-05T13:45:46Z)
- Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompts into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)
- ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z)
- How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models [29.420160518026496]
We discuss the opportunities and challenges for interactive creative applications that use prompting as a new paradigm for Human-AI interaction.
Based on our analysis, we propose four design goals for user interfaces that support prompting.
We illustrate these with concrete UI design sketches, focusing on the use case of creative writing.
arXiv Detail & Related papers (2022-09-03T10:16:34Z)
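As a concrete illustration of the evolutionary prompt search summarized in the LLM2FEA entry above, here is a toy single-population loop. It is not the paper's MFEA implementation: the real system evolves prompts across multiple task domains at once and scores candidates by running a generative model; here both the mutation operator and the fitness function are simple stand-ins.

```python
# Toy evolutionary prompt search, loosely inspired by the LLM2FEA idea of
# evolving prompts that steer a generative model toward novel designs.
# Mutation and fitness are placeholders, not the paper's method.
import random

SEED_WORDS = ["sleek", "organic", "modular", "retro", "aerodynamic", "minimal"]

def mutate(prompt: list[str]) -> list[str]:
    """Swap one descriptor for a random alternative (stand-in for
    LLM-driven prompt variation)."""
    child = prompt.copy()
    child[random.randrange(len(child))] = random.choice(SEED_WORDS)
    return child

def fitness(prompt: list[str]) -> float:
    """Stand-in objective rewarding descriptor diversity; the real system
    would score the generative model's output (e.g., novelty of a design)."""
    return len(set(prompt)) + random.random() * 0.1

def evolve(pop_size: int = 8, generations: int = 20) -> list[str]:
    population = [
        [random.choice(SEED_WORDS) for _ in range(4)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]  # truncation selection
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(population, key=fitness)

if __name__ == "__main__":
    print("best prompt:", " ".join(evolve()))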