TaleForge: Interactive Multimodal System for Personalized Story Creation
- URL: http://arxiv.org/abs/2506.21832v1
- Date: Fri, 27 Jun 2025 00:45:38 GMT
- Title: TaleForge: Interactive Multimodal System for Personalized Story Creation
- Authors: Minh-Loi Nguyen, Quang-Khai Le, Tam V. Nguyen, Minh-Triet Tran, Trung-Nghia Le
- Abstract summary: TaleForge is a personalized story-generation system that embeds users' facial images within both narratives and illustrations. A user study demonstrated heightened engagement and ownership when individuals appeared as protagonists.
- Score: 15.193340794653261
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: Storytelling is a deeply personal and creative process, yet existing methods often treat users as passive consumers, offering generic plots with limited personalization. This undermines engagement and immersion, especially where individual style or appearance is crucial. We introduce TaleForge, a personalized story-generation system that integrates large language models (LLMs) and text-to-image diffusion to embed users' facial images within both narratives and illustrations. TaleForge features three interconnected modules: Story Generation, where LLMs create narratives and character descriptions from user prompts; Personalized Image Generation, merging users' faces and outfit choices into character illustrations; and Background Generation, creating scene backdrops that incorporate personalized characters. A user study demonstrated heightened engagement and ownership when individuals appeared as protagonists. Participants praised the system's real-time previews and intuitive controls, though they requested finer narrative editing tools. TaleForge advances multimodal storytelling by aligning personalized text and imagery to create immersive, user-centric experiences.
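The three interconnected modules described in the abstract can be sketched as a simple pipeline. The sketch below uses pure-Python stubs; all function and field names are hypothetical (the paper does not publish an API), and the real system would call an LLM for story generation and a text-to-image diffusion model for the two image modules:

```python
from dataclasses import dataclass

# Hypothetical data types; names are illustrative, not from the paper.
@dataclass
class Character:
    name: str
    description: str

@dataclass
class StoryPage:
    text: str
    illustration: str  # placeholder for a rendered image

def generate_story(prompt: str) -> tuple[str, Character]:
    """Story Generation module: an LLM would turn the user's prompt into
    a narrative plus a character description (stubbed here)."""
    narrative = f"A tale begins: {prompt}"
    return narrative, Character(name="You", description="the protagonist")

def personalize_character(character: Character, face_image: str, outfit: str) -> str:
    """Personalized Image Generation module: a diffusion model would merge
    the user's face and outfit choice into a character illustration."""
    return f"illustration({character.name}, face={face_image}, outfit={outfit})"

def compose_scene(narrative: str, character_art: str) -> StoryPage:
    """Background Generation module: renders a scene backdrop that
    incorporates the personalized character."""
    return StoryPage(text=narrative, illustration=f"scene[{character_art}]")

# End-to-end flow mirroring the three interconnected modules.
narrative, hero = generate_story("a voyage across a glass sea")
art = personalize_character(hero, face_image="user_face.png", outfit="navigator coat")
page = compose_scene(narrative, art)
print(page.illustration)
```

The point of the structure is that each module's output feeds the next, which is what lets the system keep the personalized character consistent between the narrative text and the final illustrated scene.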
Related papers
- StorySage: Conversational Autobiography Writing Powered by a Multi-Agent Framework [40.06696963935616]
StorySage is a user-driven software system designed to meet the needs of a diverse group of users. The system iteratively collects user memories, updates their autobiography, and plans for future conversations.
arXiv Detail & Related papers (2025-06-17T03:44:47Z)
- Action2Dialogue: Generating Character-Centric Narratives from Scene-Level Prompts [20.281732318265483]
We present a modular pipeline that transforms action-level prompts into visually and auditorily grounded narrative dialogue. The method takes as input a pair of prompts per scene, where the first defines the setting and the second specifies a character's behavior. Each utterance is rendered as expressive, character-consistent speech, resulting in fully voiced video narratives.
arXiv Detail & Related papers (2025-05-22T15:54:42Z)
- Facilitating Video Story Interaction with Multi-Agent Collaborative System [7.7519050921867825]
The system uses a Vision Language Model (VLM) to enable machines to understand video stories. It combines Retrieval-Augmented Generation (RAG) and a Multi-Agent System (MAS) to create evolving characters and scene experiences.
arXiv Detail & Related papers (2025-05-02T09:08:13Z)
- From Panels to Prose: Generating Literary Narratives from Comics [55.544015596503726]
We develop an automated system that generates text-based literary narratives from manga comics. The approach aims to create evocative and immersive prose that not only conveys the original narrative but also captures the depth and complexity of characters.
arXiv Detail & Related papers (2025-03-30T07:18:10Z)
- IP-Prompter: Training-Free Theme-Specific Image Generation via Dynamic Visual Prompting [71.29100512700064]
IP-Prompter is a novel training-free TSI generation method. It integrates reference images into generative models, allowing users to seamlessly specify the target theme. The approach enables diverse applications, including consistent story generation, character design, realistic character generation, and style-guided image generation.
arXiv Detail & Related papers (2025-01-26T19:01:19Z)
- Imagining from Images with an AI Storytelling Tool [0.27309692684728604]
The proposed method explores the multimodal capabilities of GPT-4o to interpret visual content and create engaging stories.
The method is supported by a fully implemented tool, called ImageTeller, which accepts images from diverse sources as input.
arXiv Detail & Related papers (2024-08-21T10:49:15Z)
- Training-Free Consistent Text-to-Image Generation [80.4814768762066]
Text-to-image models struggle to portray the same subject consistently across diverse prompts. Existing approaches fine-tune the model to teach it new words that describe specific user-provided subjects.
We present ConsiStory, a training-free approach that enables consistent subject generation by sharing the internal activations of the pretrained model.
arXiv Detail & Related papers (2024-02-05T18:42:34Z)
- The Chosen One: Consistent Characters in Text-to-Image Diffusion Models [71.15152184631951]
We propose a fully automated solution for consistent character generation with the sole input being a text prompt.
Our method strikes a better balance between prompt alignment and identity consistency compared to the baseline methods.
arXiv Detail & Related papers (2023-11-16T18:59:51Z)
- NarrativePlay: Interactive Narrative Understanding [27.440721435864194]
We introduce NarrativePlay, a novel system that allows users to role-play a fictional character and interact with other characters in narratives in an immersive environment.
We leverage Large Language Models (LLMs) to generate human-like responses, guided by personality traits extracted from narratives.
NarrativePlay has been evaluated on two types of narratives, detective and adventure stories, where users can either explore the world or improve their favorability with the narrative characters through conversations.
arXiv Detail & Related papers (2023-10-02T13:24:00Z)
- ViNTER: Image Narrative Generation with Emotion-Arc-Aware Transformer [59.05857591535986]
We propose a model called ViNTER that generates image narratives focused on time series of varying emotions, represented as "emotion arcs".
We present experimental results of both manual and automatic evaluations.
arXiv Detail & Related papers (2022-02-15T10:53:08Z)
- FairyTailor: A Multimodal Generative Framework for Storytelling [33.39639788612019]
We introduce a system and a demo, FairyTailor, for human-in-the-loop visual story co-creation.
Users can create a cohesive children's fairytale by weaving generated texts and retrieved images with their input.
To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-formation of both texts and images.
arXiv Detail & Related papers (2021-07-13T02:45:08Z)
- Cue Me In: Content-Inducing Approaches to Interactive Story Generation [74.09575609958743]
We focus on the task of interactive story generation, where the user provides the model mid-level sentence abstractions.
We present two content-inducing approaches to effectively incorporate this additional information.
Experimental results from both automatic and human evaluations show that these methods produce more topically coherent and personalized stories.
arXiv Detail & Related papers (2020-10-20T00:36:15Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of its content (including all information) and is not responsible for any consequences of its use.