Spellburst: A Node-based Interface for Exploratory Creative Coding with
Natural Language Prompts
- URL: http://arxiv.org/abs/2308.03921v1
- Date: Mon, 7 Aug 2023 21:54:58 GMT
- Title: Spellburst: A Node-based Interface for Exploratory Creative Coding with
Natural Language Prompts
- Authors: Tyler Angert, Miroslav Ivan Suzara, Jenny Han, Christopher Lawrence
Pondoc, Hariharan Subramonyam
- Abstract summary: Spellburst is a large language model (LLM)-powered creative coding environment that allows artists to create generative art and explore variations through branching and merging operations.
- Score: 7.074738009603178
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Creative coding tasks are often exploratory in nature. When producing digital
artwork, artists usually begin with a high-level semantic construct such as a
"stained glass filter" and programmatically implement it by varying code
parameters such as shape, color, lines, and opacity to produce visually
appealing results. Interviews with artists show that translating semantic
constructs to program syntax can be effortful, and that current programming
tools do not lend themselves well to rapid creative exploration. To address
these challenges, we introduce Spellburst, a large language model
(LLM)-powered creative coding
environment. Spellburst provides (1) a node-based interface that allows artists
to create generative art and explore variations through branching and merging
operations, (2) expressive prompt-based interactions to engage in semantic
programming, and (3) dynamic prompt-driven interfaces and direct code editing
to seamlessly switch between semantic and syntactic exploration. Our evaluation
with artists demonstrates Spellburst's potential to enhance creative coding
practices and inform the design of computational creativity tools that bridge
semantic and syntactic spaces.
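To make the branching and merging workflow concrete, here is a minimal Python sketch that models this style of exploration as a graph of prompt/code nodes. It is an illustration only: the Node class, the branch and merge operations, and the llm_generate_code helper are hypothetical stand-ins, not Spellburst's actual implementation.

```python
from dataclasses import dataclass, field
from typing import List, Optional

def llm_generate_code(prompt: str, parent_code: Optional[str] = None) -> str:
    """Hypothetical stand-in for an LLM call that turns a semantic prompt
    (plus optional parent code) into sketch code; replace with a real API."""
    return f"// code for: {prompt}\n" + (parent_code or "")

@dataclass
class Node:
    prompt: str                 # semantic construct, e.g. "stained glass filter"
    code: str                   # generated program text, directly editable
    parents: List["Node"] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)

    def branch(self, variation_prompt: str) -> "Node":
        """Explore a variation by deriving a child node from this node's code."""
        child = Node(variation_prompt,
                     llm_generate_code(variation_prompt, self.code),
                     parents=[self])
        self.children.append(child)
        return child

def merge(nodes: List[Node], merge_prompt: str) -> Node:
    """Combine several explorations into a single node via a combining prompt."""
    combined = "\n".join(n.code for n in nodes)
    merged = Node(merge_prompt,
                  llm_generate_code(merge_prompt, combined),
                  parents=list(nodes))
    for n in nodes:
        n.children.append(merged)
    return merged

# Usage: start from a semantic construct, branch into variations, then merge.
root = Node("stained glass filter", llm_generate_code("stained glass filter"))
warm = root.branch("warmer colors, thicker lead lines")
geo = root.branch("geometric shapes, higher opacity")
final = merge([warm, geo], "combine the warm palette with the geometric shapes")
```

Because every node keeps both its prompt and its code, an artist can move between semantic edits (re-prompting a node) and syntactic edits (changing the code directly), which mirrors the semantic/syntactic switching the abstract describes.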
Related papers
- Redefining <Creative> in Dictionary: Towards an Enhanced Semantic Understanding of Creative Generation [39.93527514513576]
Current methods rely heavily on reference prompts or images to achieve a creative effect.
We introduce CreTok, which brings meta-creativity to diffusion models by redefining 'creative' as a new token.
CreTok achieves such redefinition by iteratively sampling diverse text pairs.
arXiv Detail & Related papers (2024-10-31T17:19:03Z)
- PartCraft: Crafting Creative Objects by Parts [128.30514851911218]
This paper propels creative control in generative visual AI by allowing users to "select" visual concepts by parts.
For the first time, users can choose individual parts of visual concepts for their creative endeavors.
This enables fine-grained generation that precisely captures the selected visual concepts.
arXiv Detail & Related papers (2024-07-05T15:53:04Z)
- MetaDesigner: Advancing Artistic Typography through AI-Driven, User-Centric, and Multilingual WordArt Synthesis [65.78359025027457]
MetaDesigner revolutionizes artistic typography by leveraging the strengths of Large Language Models (LLMs) to drive a design paradigm centered around user engagement.
A comprehensive feedback mechanism harnesses insights from multimodal models and user evaluations to refine and enhance the design process iteratively.
Empirical validations highlight MetaDesigner's capability to effectively serve diverse WordArt applications, consistently producing aesthetically appealing and context-sensitive results.
arXiv Detail & Related papers (2024-06-28T11:58:26Z)
- Dynamic Typography: Bringing Text to Life via Video Diffusion Prior [73.72522617586593]
We present an automated text animation scheme, termed "Dynamic Typography", which deforms letters to convey semantic meaning and infuses them with vibrant movements based on user prompts.
Our technique harnesses vector graphics representations and an end-to-end optimization-based framework.
arXiv Detail & Related papers (2024-04-17T17:59:55Z)
- Exploring the Potential of Large Language Models in Artistic Creation: Collaboration and Reflection on Creative Programming [10.57792673254363]
We compare two common collaboration approaches: invoking the LLM on the entire program versus on multiple subtasks (see the sketch after this list).
Our findings show that the two methods stimulate different reflections from artists.
Our work reveals the artistic potential of LLMs in creative coding.
arXiv Detail & Related papers (2024-02-15T07:00:06Z)
- CreativeSynth: Creative Blending and Synthesis of Visual Arts based on Multimodal Diffusion [74.44273919041912]
Large-scale text-to-image generative models have made impressive strides, showcasing their ability to synthesize a vast array of high-quality images.
However, adapting these models for artistic image editing presents two significant challenges.
We build CreativeSynth, a unified framework based on a diffusion model with the ability to coordinate multimodal inputs.
arXiv Detail & Related papers (2024-01-25T10:42:09Z)
- DrawTalking: Building Interactive Worlds by Sketching and Speaking [19.421582154948627]
We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking while telling stories.
It emphasizes user control and flexibility, and gives programming-like capability without requiring code.
arXiv Detail & Related papers (2024-01-11T03:02:17Z)
- Creative Agents: Empowering Agents with Imagination for Creative Tasks [31.920963353890393]
We propose a class of solutions for creative agents, where the controller is enhanced with an imaginator that generates detailed imaginations of task outcomes conditioned on language instructions.
We benchmark creative tasks with the challenging open-world game Minecraft, where the agents are asked to create diverse buildings given free-form language instructions.
We perform a detailed experimental analysis of creative agents, showing that they are the first AI agents to accomplish diverse building creation in the survival mode of Minecraft.
arXiv Detail & Related papers (2023-12-05T06:00:52Z)
- Structure-Guided Image Completion with Image-level and Object-level Semantic Discriminators [97.12135238534628]
We propose a learning paradigm that consists of semantic discriminators and object-level discriminators for improving the generation of complex semantics and objects.
Specifically, the semantic discriminators leverage pretrained visual features to improve the realism of the generated visual concepts.
Our proposed scheme significantly improves the generation quality and achieves state-of-the-art results on various tasks.
arXiv Detail & Related papers (2022-12-13T01:36:56Z)
- IR-GAN: Image Manipulation with Linguistic Instruction by Increment Reasoning [110.7118381246156]
Increment Reasoning Generative Adversarial Network (IR-GAN) aims to reason about the consistency between the visual increment in images and the semantic increment in instructions.
First, we introduce word-level and instruction-level instruction encoders to learn the user's intention from history-correlated instructions as the semantic increment.
Second, we embed the representation of the semantic increment into that of the source image to generate the target image, where the source image serves as a reference.
arXiv Detail & Related papers (2022-04-02T07:48:39Z)
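The comparison of collaboration approaches in "Exploring the Potential of Large Language Models in Artistic Creation" above can be illustrated with a short sketch. This is a schematic illustration under assumed names, not the paper's code: the llm() helper is a hypothetical stand-in for any chat-completion API.

```python
def llm(prompt: str) -> str:
    """Hypothetical LLM call; replace with a real API client."""
    return f"// generated code for: {prompt}"

# Approach 1: invoke the LLM on the entire program in a single prompt.
def whole_program(goal: str) -> str:
    return llm(f"Write a complete p5.js sketch that renders: {goal}")

# Approach 2: decompose the work into subtasks, prompt each, then assemble.
def by_subtasks(goal: str, subtasks: list) -> str:
    parts = [llm(f"For the artwork '{goal}', write only the code to {task}")
             for task in subtasks]
    return "\n".join(parts)

art = "a rotating stained glass rose window"
print(whole_program(art))
print(by_subtasks(art, ["set up the canvas", "draw one glass pane",
                        "tile the panes radially", "animate the rotation"]))
```

The paper's finding that the two methods stimulate different reflections from artists maps onto this difference: the whole-program call yields one monolithic result to react to, while the subtask decomposition exposes intermediate results the artist can reflect on step by step.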