DrawTalking: Building Interactive Worlds by Sketching and Speaking
- URL: http://arxiv.org/abs/2401.05631v4
- Date: Mon, 5 Aug 2024 03:46:34 GMT
- Title: DrawTalking: Building Interactive Worlds by Sketching and Speaking
- Authors: Karl Toby Rosenberg, Rubaiat Habib Kazi, Li-Yi Wei, Haijun Xia, Ken Perlin
- Abstract summary: We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking while telling stories.
It emphasizes user control and flexibility, and gives programming-like capability without requiring code.
- Score: 19.421582154948627
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce DrawTalking, an approach to building and controlling interactive worlds by sketching and speaking while telling stories. It emphasizes user control and flexibility, and gives programming-like capability without requiring code. An early open-ended study with our prototype shows that the mechanics resonate and are applicable to many creative-exploratory use cases, with the potential to inspire and inform research in future natural interfaces for creative exploration and authoring.
Related papers
- SketchAgent: Language-Driven Sequential Sketch Generation [34.96339247291013]
SketchAgent is a language-driven, sequential sketch generation method.
We present an intuitive sketching language, introduced to the model through in-context examples.
By drawing stroke by stroke, our agent captures the evolving, dynamic qualities intrinsic to sketching.
arXiv Detail & Related papers (2024-11-26T18:32:06Z)
- Sketch2Code: Evaluating Vision-Language Models for Interactive Web Design Prototyping [55.98643055756135]
We introduce Sketch2Code, a benchmark that evaluates state-of-the-art Vision-Language Models (VLMs) on automating the conversion of rudimentary sketches into webpage prototypes.
We analyze ten commercial and open-source models, showing that Sketch2Code is challenging for existing VLMs.
A user study with UI/UX experts reveals a significant preference for proactive question-asking over passive feedback reception.
arXiv Detail & Related papers (2024-10-21T17:39:49Z)
- Introducing Brain-like Concepts to Embodied Hand-crafted Dialog Management System [1.178527785547223]
This paper presents a neural behavior engine that allows the creation of mixed-initiative dialog and action generation based on hand-crafted models using a graphical language.
A demonstration of the usability of such a brain-like architecture is described through a virtual receptionist application running in a semi-public space.
arXiv Detail & Related papers (2024-06-13T10:54:03Z)
- SketchDreamer: Interactive Text-Augmented Creative Sketch Ideation [111.2195741547517]
We present a method to generate controlled sketches using a text-conditioned diffusion model trained on pixel representations of images.
Our objective is to empower non-professional users to create sketches and, through a series of optimisation processes, transform a narrative into a storyboard.
arXiv Detail & Related papers (2023-08-27T19:44:44Z)
- Spellburst: A Node-based Interface for Exploratory Creative Coding with Natural Language Prompts [7.074738009603178]
Spellburst is a large language model (LLM) powered creative coding environment.
Spellburst allows artists to create generative art and explore variations through branching and merging operations.
arXiv Detail & Related papers (2023-08-07T21:54:58Z)
- PromptCrafter: Crafting Text-to-Image Prompt through Mixed-Initiative Dialogue with LLM [2.2894985490441377]
We present PromptCrafter, a novel mixed-initiative system that allows step-by-step crafting of text-to-image prompts.
Through this iterative process, users can efficiently explore the model's capabilities and clarify their intent.
PromptCrafter also lets users refine prompts by answering clarifying questions generated by a Large Language Model.
arXiv Detail & Related papers (2023-07-18T05:51:00Z)
- ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human [76.62897301298699]
ChatPLUG is a Chinese open-domain dialogue system for digital human applications that is instruction-finetuned on a wide range of dialogue tasks in a unified internet-augmented format.
We show that ChatPLUG outperforms state-of-the-art Chinese dialogue systems on both automatic and human evaluation.
We deploy ChatPLUG in real-world applications such as Smart Speaker and Instant Message applications with fast inference.
arXiv Detail & Related papers (2023-04-16T18:16:35Z)
- Generative Agents: Interactive Simulacra of Human Behavior [86.1026716646289]
We introduce generative agents: computational software agents that simulate believable human behavior.
We describe an architecture that extends a large language model to store a complete record of the agent's experiences.
We instantiate generative agents to populate an interactive sandbox environment inspired by The Sims.
arXiv Detail & Related papers (2023-04-07T01:55:19Z)
- Teachable Reality: Prototyping Tangible Augmented Reality with Everyday Objects by Leveraging Interactive Machine Teaching [4.019017835137353]
Teachable Reality is an augmented reality (AR) prototyping tool for creating interactive tangible AR applications with arbitrary everyday objects.
It identifies user-defined tangible and gestural interactions using an on-demand computer vision model.
Our approach can lower the barrier to creating functional AR prototypes while also allowing flexible, general-purpose prototyping experiences.
arXiv Detail & Related papers (2023-02-21T23:03:49Z)
- A Case Study in Engineering a Conversational Programming Assistant's Persona [72.47187215119664]
Conversational capability was achieved by using an existing code-fluent Large Language Model.
A discussion of the evolution of the prompt provides a case study in how to coax an existing foundation model to behave in a desirable manner for a particular application.
arXiv Detail & Related papers (2023-01-13T14:48:47Z)
- I Know What You Draw: Learning Grasp Detection Conditioned on a Few Freehand Sketches [74.63313641583602]
We propose a method to generate potential grasp configurations relevant to the sketch-depicted objects.
Our model is trained and tested in an end-to-end manner, which makes it easy to implement in real-world applications.
arXiv Detail & Related papers (2022-05-09T04:23:36Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed information and is not responsible for any consequences of its use.