FairyTailor: A Multimodal Generative Framework for Storytelling
- URL: http://arxiv.org/abs/2108.04324v1
- Date: Tue, 13 Jul 2021 02:45:08 GMT
- Title: FairyTailor: A Multimodal Generative Framework for Storytelling
- Authors: Eden Bensaid, Mauro Martino, Benjamin Hoover, Jacob Andreas and Hendrik Strobelt
- Abstract summary: We introduce a system and a demo, FairyTailor, for human-in-the-loop visual story co-creation.
Users can create a cohesive children's fairytale by weaving generated texts and retrieved images with their input.
To our knowledge, this is the first dynamic tool for multimodal story generation that allows interactive co-formation of both texts and images.
- Score: 33.39639788612019
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Storytelling is an open-ended task that entails creative thinking and
requires a constant flow of ideas. Natural language generation (NLG) for
storytelling is especially challenging because it requires the generated text
to follow an overall theme while remaining creative and diverse to engage the
reader. In this work, we introduce a system and a web-based demo, FairyTailor,
for human-in-the-loop visual story co-creation. Users can create a cohesive
children's fairytale by weaving generated texts and retrieved images with their
input. FairyTailor adds another modality and modifies the text generation
process to produce a coherent and creative sequence of text and images. To our
knowledge, this is the first dynamic tool for multimodal story generation that
allows interactive co-formation of both texts and images. It allows users to
give feedback on co-created stories and share their results.
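The co-creation loop can be pictured as alternating turns: the user writes or edits, the model proposes the next passage, and an image is retrieved to match it. Below is a minimal Python sketch of one such turn; the GPT-2 backbone, the toy caption index, and the word-overlap retrieval are illustrative assumptions, not the system's actual components.
```python
# A minimal sketch of one human-in-the-loop co-creation turn, assuming a
# GPT-2 backbone and a toy keyword-overlap image index; FairyTailor's
# actual generator and retriever are not specified in this summary.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")

# Hypothetical stand-in for an image corpus: caption -> image file.
IMAGE_INDEX = {
    "a princess walking through a dark forest": "forest.png",
    "a dragon sleeping on a pile of gold": "dragon.png",
    "a small cottage with a glowing window": "cottage.png",
}

def continue_story(story_so_far: str, max_new_tokens: int = 60) -> str:
    """Propose the next passage; the user may keep, edit, or reject it."""
    out = generator(story_so_far, max_new_tokens=max_new_tokens,
                    do_sample=True, top_p=0.9)[0]["generated_text"]
    return out[len(story_so_far):]

def retrieve_image(passage: str) -> str:
    """Pick the caption sharing the most words with the passage."""
    words = set(passage.lower().split())
    return max(IMAGE_INDEX, key=lambda cap: len(words & set(cap.split())))

story = "Once upon a time, a princess set out into the woods."
draft = continue_story(story)            # model proposes text
caption = retrieve_image(story + draft)  # an image is matched to it
print(draft, "\n[illustration:", IMAGE_INDEX[caption], "]")
```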
Related papers
- A Character-Centric Creative Story Generation via Imagination [15.345466372805516]
We introduce a novel story generation framework called CCI (Character-centric Creative story generation via Imagination).
CCI features two modules for creative story generation: IG (Image-Guided Imagination) and MW (Multi-Writer model).
In the IG module, we utilize a text-to-image model to create visual representations of key story elements, such as characters, backgrounds, and main plots.
The MW module uses these story elements to generate multiple persona-description candidates and selects the best one to insert into the story, thereby enhancing the richness and depth of the narrative.
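As a rough illustration of the candidate-then-select step, the sketch below samples several persona descriptions and keeps the one that best fits the story context. The GPT-2 sampler and the word-overlap scorer are placeholder assumptions; the paper's actual selection criterion is not given in this summary.
```python
# A rough illustration of the MW (Multi-Writer) step: sample several
# persona-description candidates, then keep the best fit. The GPT-2
# sampler and word-overlap scorer are placeholder assumptions.
from transformers import pipeline

writer = pipeline("text-generation", model="gpt2")

def persona_candidates(prompt: str, k: int = 3) -> list[str]:
    outs = writer(prompt, max_new_tokens=40, do_sample=True,
                  top_p=0.9, num_return_sequences=k)
    return [o["generated_text"][len(prompt):] for o in outs]

def select_best(candidates: list[str], context: str) -> str:
    ctx = set(context.lower().split())
    return max(candidates, key=lambda c: len(ctx & set(c.lower().split())))

context = "A shy lighthouse keeper who secretly maps the stars."
prompt = "Describe this character: " + context
print(select_best(persona_candidates(prompt), context))
```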
arXiv Detail & Related papers (2024-09-25T06:54:29Z)
- The Art of Storytelling: Multi-Agent Generative AI for Dynamic Multimodal Narratives [3.5001789247699535]
This paper introduces the concept of an educational tool that utilizes Generative Artificial Intelligence (GenAI) to enhance storytelling for children.
The system combines GenAI-driven narrative co-creation, text-to-speech conversion, and text-to-video generation to produce an engaging experience for learners.
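The summary names a three-stage chain, which the sketch below wires together. The GPT-2 narrative model and pyttsx3 TTS engine are stand-in choices, and the video stage is a labeled stub, since the paper's GenAI backends are not named here.
```python
# The three-stage chain named in the summary: narrative generation, then
# text-to-speech, then video. GPT-2 and pyttsx3 are stand-in choices, and
# the video stage is a stub.
from transformers import pipeline
import pyttsx3  # offline text-to-speech engine

storyteller = pipeline("text-generation", model="gpt2")

def make_story(prompt: str) -> str:
    return storyteller(prompt, max_new_tokens=80,
                       do_sample=True)[0]["generated_text"]

def narrate(text: str, path: str = "story.wav") -> None:
    engine = pyttsx3.init()
    engine.save_to_file(text, path)
    engine.runAndWait()

def make_video(text: str, path: str = "story.mp4") -> None:
    # Stub: a text-to-video generator would render clips here.
    pass

story = make_story("Once upon a time, a curious robot")
narrate(story)   # audio track for the learner
make_video(story)
```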
arXiv Detail & Related papers (2024-09-17T15:10:23Z)
- SEED-Story: Multimodal Long Story Generation with Large Language Model [66.37077224696242]
SEED-Story is a novel method that leverages a Multimodal Large Language Model (MLLM) to generate extended multimodal stories.
We propose a multimodal attention sink mechanism to enable the generation of stories with up to 25 sequences (only 10 seen during training) in a highly efficient autoregressive manner.
We present a large-scale and high-resolution dataset named StoryStream for training our model and quantitatively evaluating the task of multimodal story generation in various aspects.
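An attention sink keeps the first few key/value positions permanently cached while the rest of the cache slides, letting generation run past the trained context length. The sketch below shows the generic eviction step for a text-only KV cache; SEED-Story's multimodal variant, which also routes image tokens through the cache, is omitted.
```python
# A generic attention-sink eviction step for a KV cache: keep the first
# n_sink positions plus a sliding window of recent positions, so
# generation can run past the trained length.
import torch

def evict(keys: torch.Tensor, values: torch.Tensor,
          n_sink: int = 4, window: int = 512):
    """keys/values: [seq_len, n_heads, head_dim]. Drops the middle of the
    cache, keeping sink tokens and the most recent `window` tokens."""
    seq_len = keys.shape[0]
    if seq_len <= n_sink + window:
        return keys, values
    keep = torch.cat([torch.arange(n_sink),
                      torch.arange(seq_len - window, seq_len)])
    return keys[keep], values[keep]

k, v = torch.randn(1000, 8, 64), torch.randn(1000, 8, 64)
k2, v2 = evict(k, v)
print(k2.shape)  # torch.Size([516, 8, 64])
```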
arXiv Detail & Related papers (2024-07-11T17:21:03Z)
- Text-Only Training for Visual Storytelling [107.19873669536523]
We formulate visual storytelling as a visual-conditioned story generation problem.
We propose a text-only training method that separates the learning of cross-modality alignment and story generation.
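One common way to realize such a separation is to train the story decoder on CLIP text embeddings and swap in CLIP image embeddings at inference, relying on their shared space. The sketch below illustrates that substitution; treating it as this paper's exact recipe would be an assumption.
```python
# A sketch of text-only training via a shared embedding space: condition
# the decoder on CLIP *text* embeddings during training, then substitute
# CLIP *image* embeddings at inference. The decoder itself is schematic.
import torch
from transformers import CLIPModel, CLIPProcessor

clip = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
proc = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def text_condition(caption: str) -> torch.Tensor:
    """Training-time conditioning: no images needed."""
    inputs = proc(text=[caption], return_tensors="pt", padding=True)
    return clip.get_text_features(**inputs)

def image_condition(image) -> torch.Tensor:
    """Inference-time conditioning: same 512-d space as the text side."""
    inputs = proc(images=image, return_tensors="pt")
    return clip.get_image_features(**inputs)

# A story decoder trained to consume text_condition(...) vectors can be
# driven by image_condition(...) vectors at test time.
print(text_condition("a fox finds a lantern").shape)  # torch.Size([1, 512])
```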
arXiv Detail & Related papers (2023-08-17T09:32:17Z)
- Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion Models [70.86603627188519]
We focus on the novel yet challenging task of generating a coherent image sequence from a given storyline, denoted open-ended visual storytelling.
We propose a learning-based auto-regressive image generation model, termed StoryGen, with a novel vision-language context module.
We show that StoryGen can generalize to unseen characters without any optimization, and can generate image sequences with coherent content and consistent characters.
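Structurally, generation is autoregressive over frames: each new image is conditioned on the current sentence plus all previously generated image-text pairs. The sketch below shows only that data flow; generate_frame is a stub standing in for the diffusion model and the vision-language context module.
```python
# The autoregressive frame loop described for StoryGen: each image is
# conditioned on the current sentence plus all previous (image, text)
# pairs. `generate_frame` is a stub for the latent diffusion model.
import torch

def generate_frame(sentence: str,
                   history: list[tuple[torch.Tensor, str]]) -> torch.Tensor:
    # Stub: a real model would fuse `history` via cross-attention inside
    # a diffusion U-Net; here we return a random placeholder "image".
    return torch.rand(3, 256, 256)

storyline = ["A fox finds a lantern.",
             "The fox carries the lantern into a cave.",
             "Inside, the lantern wakes a sleeping bear."]
history: list[tuple[torch.Tensor, str]] = []
for sentence in storyline:
    frame = generate_frame(sentence, history)
    history.append((frame, sentence))  # becomes context for the next frame
print(len(history), history[0][0].shape)
```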
arXiv Detail & Related papers (2023-06-01T17:58:50Z)
- Visual Story Generation Based on Emotion and Keywords [5.3860505447668015]
This work proposes a story generation pipeline to co-create visual stories with users.
The pipeline includes two parts: narrative and image generation.
arXiv Detail & Related papers (2023-01-07T03:56:49Z)
- Visualize Before You Write: Imagination-Guided Open-Ended Text Generation [68.96699389728964]
We propose iNLG, which uses machine-generated images to guide language models in open-ended text generation.
Experiments and analyses demonstrate the effectiveness of iNLG on open-ended text generation tasks.
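The guiding signal can be injected as a soft prefix: an imagined image is encoded and its embedding is prepended to the token embeddings before decoding. In the sketch below the image synthesizer is stubbed and the projection layer is untrained, so it shows the mechanics rather than iNLG's learned mapping.
```python
# Mechanics of image-guided decoding: encode an imagined image, project it
# into the LM's embedding space, and prepend it as a soft prefix. The
# synthesizer is stubbed and the projection is untrained (assumptions).
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2")

def imagine(context: str) -> torch.Tensor:
    # Stub for text-to-image synthesis + image encoding: a 512-d vector.
    return torch.randn(1, 512)

project = torch.nn.Linear(512, lm.config.n_embd)  # untrained, shapes only

context = "The door creaked open and"
ids = tok(context, return_tensors="pt").input_ids
tok_emb = lm.transformer.wte(ids)
prefix = project(imagine(context)).unsqueeze(1)      # [1, 1, n_embd]
inputs_embeds = torch.cat([prefix, tok_emb], dim=1)  # image-guided prefix

out = lm.generate(inputs_embeds=inputs_embeds, max_new_tokens=20,
                  do_sample=True, pad_token_id=tok.eos_token_id)
print(tok.decode(out[0], skip_special_tokens=True))
```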
arXiv Detail & Related papers (2022-10-07T18:01:09Z)
- Event Transition Planning for Open-ended Text Generation [55.729259805477376]
Open-ended text generation tasks require models to generate a coherent continuation given limited preceding context.
We propose a novel two-stage method which explicitly arranges the ensuing events in open-ended text generation.
Our approach can be understood as a specially-trained coarse-to-fine algorithm.
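The two stages can be prototyped with a single language model prompted differently: first to list the ensuing events (coarse), then to realize them as a continuation (fine). Reusing one GPT-2 for both stages, as below, is a simplification; the paper trains a dedicated event-transition planner.
```python
# A coarse-to-fine prototype: plan the ensuing events first, then realize
# them as text. One GPT-2 with two prompts is a simplification of the
# paper's dedicated event-transition planner.
from transformers import pipeline

gen = pipeline("text-generation", model="gpt2")

def plan_events(context: str) -> str:
    """Coarse stage: sketch what should happen next."""
    prompt = f"{context}\nNext events, in order:\n1."
    out = gen(prompt, max_new_tokens=30, do_sample=True, top_p=0.9)
    return "1." + out[0]["generated_text"][len(prompt):]

def realize(context: str, events: str) -> str:
    """Fine stage: turn the planned events into a fluent continuation."""
    prompt = f"{context}\nOutline: {events}\nContinuation:"
    out = gen(prompt, max_new_tokens=60, do_sample=True, top_p=0.9)
    return out[0]["generated_text"][len(prompt):]

ctx = "The knight reached the ruined tower at dusk."
print(realize(ctx, plan_events(ctx)))
```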
arXiv Detail & Related papers (2022-04-20T13:37:51Z)
- Telling Creative Stories Using Generative Visual Aids [52.623545341588304]
We asked writers to write creative stories from a starting prompt, and provided them with visuals created by generative AI models from the same prompt.
Compared to a control group, writers who used the visuals as a story-writing aid wrote significantly more creative, original, complete, and visualizable stories.
Findings indicate that cross-modality inputs from AI can benefit divergent aspects of creativity in human-AI co-creation, but hinder convergent thinking.
arXiv Detail & Related papers (2021-10-27T23:13:47Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of the information above and is not responsible for any consequences of its use.