Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
- URL: http://arxiv.org/abs/2310.12953v3
- Date: Wed, 13 Mar 2024 19:50:00 GMT
- Title: Luminate: Structured Generation and Exploration of Design Space with Large Language Models for Human-AI Co-Creation
- Authors: Sangho Suh, Meng Chen, Bryan Min, Toby Jia-Jun Li, Haijun Xia
- Abstract summary: We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas.
We propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses.
- Score: 19.62178304006683
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Thanks to their generative capabilities, large language models (LLMs) have become an invaluable tool for creative processes. These models have the capacity to produce hundreds and thousands of visual and textual outputs, offering abundant inspiration for creative endeavors. But are we harnessing their full potential? We argue that current interaction paradigms fall short, guiding users towards rapid convergence on a limited set of ideas, rather than empowering them to explore the vast latent design space in generative models. To address this limitation, we propose a framework that facilitates the structured generation of design space in which users can seamlessly explore, evaluate, and synthesize a multitude of responses. We demonstrate the feasibility and usefulness of this framework through the design and development of an interactive system, Luminate, and a user study with 14 professional writers. Our work advances how we interact with LLMs for creative tasks, introducing a way to harness the creative potential of LLMs.
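To make the structured design-space idea concrete, here is a minimal sketch: an LLM is first asked to propose key dimensions of the design space, then to generate one response per combination of dimension values, so users see the breadth of the space rather than a single answer. This is an illustrative approximation, not Luminate's actual implementation; the prompts, model name, and helper functions are hypothetical stand-ins for any OpenAI-style chat API.

```python
# Minimal sketch of structured design-space generation (not Luminate's code).
# Assumes an OpenAI-style chat API and that OPENAI_API_KEY is set.
import itertools
import json
from openai import OpenAI

client = OpenAI()

def chat(prompt: str) -> str:
    """Single-turn helper around a chat-completion API."""
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption: any capable chat model works here
        messages=[{"role": "user", "content": prompt}],
    )
    return resp.choices[0].message.content

def generate_design_space(task: str, n_dims: int = 3) -> dict[str, list[str]]:
    """Ask the LLM to propose design dimensions with a few values each.
    Assumes the model returns bare JSON; real code would validate and retry."""
    raw = chat(
        f"For the creative task '{task}', propose {n_dims} design dimensions, "
        'each with 2-3 possible values. Reply with JSON only, shaped like '
        '{"dimension name": ["value1", "value2"]}.'
    )
    return json.loads(raw)

def explore(task: str) -> list[tuple[tuple[str, ...], str]]:
    """Generate one response per combination of dimension values,
    surfacing the design space instead of converging on one idea."""
    space = generate_design_space(task)
    results = []
    for combo in itertools.product(*space.values()):
        spec = ", ".join(f"{d}={v}" for d, v in zip(space, combo))
        results.append((combo, chat(f"{task} Constraints: {spec}")))
    return results

if __name__ == "__main__":
    for combo, text in explore("Write a tagline for a hiking app."):
        print(" | ".join(combo), "->", text[:80])
```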
Related papers
- A Framework for Collaborating a Large Language Model Tool in Brainstorming for Triggering Creative Thoughts [2.709166684084394]
This study proposes a framework called GPS, which employs goals, prompts, and strategies to guide designers to systematically work with an LLM tool for improving the creativity of ideas generated during brainstorming.
Our framework, tested through a design example and a case study, demonstrates its effectiveness in stimulating creativity and in seamlessly integrating the LLM tool into design practices.
arXiv Detail & Related papers (2024-10-10T13:39:27Z)
- MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement.
A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively.
Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z)
- LLM2FEA: Discover Novel Designs with Generative Evolutionary Multitasking [21.237950330178354]
We propose the first attempt to discover novel designs in generative models by transferring knowledge across multiple domains.
By utilizing a multi-factorial evolutionary algorithm (MFEA) to drive a large language model, LLM2FEA integrates knowledge from various fields to generate prompts that guide the generative model in discovering novel and practical objects (see the toy sketch of an evolutionary prompt loop after this list).
arXiv Detail & Related papers (2024-06-21T07:20:51Z)
- Creativity Has Left the Chat: The Price of Debiasing Language Models [1.223779595809275]
We investigate the unintended consequences of Reinforcement Learning from Human Feedback (RLHF) on the creativity of Large Language Models (LLMs).
Our findings have significant implications for marketers who rely on LLMs for creative tasks such as copywriting, ad creation, and customer persona generation.
arXiv Detail & Related papers (2024-06-08T22:14:51Z)
- Divergent Creativity in Humans and Large Language Models [37.67363469600804]
The recent surge in the capabilities of Large Language Models has led to claims that they are approaching a level of creativity akin to human capabilities.
We leverage recent advances in creativity science to build a framework for in-depth analysis of divergent creativity in both state-of-the-art LLMs and a substantial dataset of 100,000 humans.
arXiv Detail & Related papers (2024-05-13T22:37:52Z)
- LVLM-Interpret: An Interpretability Tool for Large Vision-Language Models [50.259006481656094]
We present a novel interactive application aimed at understanding the internal mechanisms of large vision-language models.
Our interface is designed to enhance the interpretability of the image patches, which are instrumental in generating an answer.
We present a case study of how our application can aid in understanding failure mechanisms in a popular large multi-modal model: LLaVA.
arXiv Detail & Related papers (2024-04-03T23:57:34Z)
- I-Design: Personalized LLM Interior Designer [57.00412237555167]
I-Design is a personalized interior designer that allows users to generate and visualize their design goals through natural language communication.
I-Design starts with a team of large language model agents that engage in dialogues and logical reasoning with one another.
The final design is then constructed in 3D by retrieving and integrating assets from an existing object database.
arXiv Detail & Related papers (2024-04-03T16:17:53Z)
- Multi-modal Instruction Tuned LLMs with Fine-grained Visual Perception [63.03288425612792]
We propose AnyRef, a general MLLM that can generate pixel-wise object perceptions and natural language descriptions from multi-modality references.
Our model achieves state-of-the-art results across multiple benchmarks, including diverse modality referring segmentation and region-level referring expression generation.
arXiv Detail & Related papers (2024-03-05T13:45:46Z)
- Towards More Unified In-context Visual Understanding [74.55332581979292]
We present a new ICL framework for visual understanding with multi-modal output enabled.
First, we quantize and embed both text and visual prompts into a unified representational space.
Then a decoder-only sparse transformer architecture is employed to perform generative modeling on them.
arXiv Detail & Related papers (2023-12-05T06:02:21Z)
- ConceptLab: Creative Concept Generation using VLM-Guided Diffusion Prior Constraints [56.824187892204314]
We present the task of creative text-to-image generation, where we seek to generate new members of a broad category.
We show that the creative generation problem can be formulated as an optimization process over the output space of the diffusion prior.
We incorporate a question-answering Vision-Language Model (VLM) that adaptively adds new constraints to the optimization problem, encouraging the model to discover increasingly unique creations.
arXiv Detail & Related papers (2023-08-03T17:04:41Z)
- How to Prompt? Opportunities and Challenges of Zero- and Few-Shot Learning for Human-AI Interaction in Creative Applications of Generative Models [29.420160518026496]
We discuss the opportunities and challenges for interactive creative applications that use prompting as a new paradigm for Human-AI interaction.
Based on our analysis, we propose four design goals for user interfaces that support prompting.
We illustrate these with concrete UI design sketches, focusing on the use case of creative writing.
arXiv Detail & Related papers (2022-09-03T10:16:34Z)
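As a concrete illustration of the evolutionary prompt search summarized in the LLM2FEA entry above, here is a toy single-population loop. It is not the paper's MFEA implementation: the real system evolves prompts across multiple task domains at once and scores candidates by running a generative model; here both the mutation operator and the fitness function are simple stand-ins.

```python
# Toy evolutionary prompt search, loosely inspired by the LLM2FEA idea of
# evolving prompts that steer a generative model toward novel designs.
# Mutation and fitness are placeholders, not the paper's method.
import random

SEED_WORDS = ["sleek", "organic", "modular", "retro", "aerodynamic", "minimal"]

def mutate(prompt: list[str]) -> list[str]:
    """Swap one descriptor for a random alternative (stand-in for
    LLM-driven prompt variation)."""
    child = prompt.copy()
    child[random.randrange(len(child))] = random.choice(SEED_WORDS)
    return child

def fitness(prompt: list[str]) -> float:
    """Stand-in objective rewarding descriptor diversity; the real system
    would score the generative model's output (e.g., novelty of a design)."""
    return len(set(prompt)) + random.random() * 0.1

def evolve(pop_size: int = 8, generations: int = 20) -> list[str]:
    population = [
        [random.choice(SEED_WORDS) for _ in range(4)] for _ in range(pop_size)
    ]
    for _ in range(generations):
        scored = sorted(population, key=fitness, reverse=True)
        parents = scored[: pop_size // 2]  # truncation selection
        population = parents + [mutate(random.choice(parents)) for _ in parents]
    return max(population, key=fitness)

if __name__ == "__main__":
    print("best prompt:", " ".join(evolve()))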