Plot and Rework: Modeling Storylines for Visual Storytelling
- URL: http://arxiv.org/abs/2105.06950v1
- Date: Fri, 14 May 2021 16:41:29 GMT
- Title: Plot and Rework: Modeling Storylines for Visual Storytelling
- Authors: Chi-Yang Hsu, Yun-Wei Chu, Ting-Hao (Kenneth) Huang, Lun-Wei Ku
- Abstract summary: This paper introduces PR-VIST, a framework that represents the input image sequence as a story graph in which it finds the best path to form a storyline.
PR-VIST learns to generate the final story via an iterative training process.
An ablation study shows that both plotting and reworking contribute to the model's superiority.
- Score: 12.353812582863837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Writing a coherent and engaging story is not easy. Creative writers use their
knowledge and worldview to put disjointed elements together to form a coherent
storyline, and work and rework iteratively toward perfection. Automated visual
storytelling (VIST) models, however, make poor use of external knowledge and
iterative generation when attempting to create stories. This paper introduces
PR-VIST, a framework that represents the input image sequence as a story graph
in which it finds the best path to form a storyline. PR-VIST then takes this
path and learns to generate the final story via an iterative training process.
This framework produces stories that are superior in terms of diversity,
coherence, and humanness, per both automatic and human evaluations. An ablation
study shows that both plotting and reworking contribute to the model's
superiority.
Related papers
- Generating Visual Stories with Grounded and Coreferent Characters [63.07511918366848]
We present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions.
Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark.
We also propose new evaluation metrics to measure the richness of characters and coreference in stories.
arXiv Detail & Related papers (2024-09-20T14:56:33Z) - StoryImager: A Unified and Efficient Framework for Coherent Story Visualization and Completion [78.1014542102578]
Story visualization aims to generate realistic and coherent images based on a storyline.
Current models adopt a frame-by-frame architecture by transforming the pre-trained text-to-image model into an auto-regressive manner.
We propose a bidirectional, unified, and efficient framework, namely StoryImager.
arXiv Detail & Related papers (2024-04-09T03:22:36Z) - TARN-VIST: Topic Aware Reinforcement Network for Visual Storytelling [14.15543866199545]
As a cross-modal task, visual storytelling aims to generate a story for an ordered image sequence automatically.
We propose a novel method, Topic Aware Reinforcement Network for VIsual StoryTelling (TARN-VIST)
In particular, we pre-extracted the topic information of stories from both visual and linguistic perspectives.
arXiv Detail & Related papers (2024-03-18T08:01:23Z) - SCO-VIST: Social Interaction Commonsense Knowledge-based Visual
Storytelling [12.560014305032437]
This paper introduces SCO-VIST, a framework representing the image sequence as a graph with objects and relations.
SCO-VIST then takes this graph representing plot points and creates bridges between plot points with semantic and occurrence-based edge weights.
This weighted story graph produces the storyline in a sequence of events using Floyd-Warshall's algorithm.
arXiv Detail & Related papers (2024-02-01T04:09:17Z) - Visual Storytelling with Question-Answer Plans [70.89011289754863]
We present a novel framework which integrates visual representations with pretrained language models and planning.
Our model translates the image sequence into a visual prefix, a sequence of continuous embeddings which language models can interpret.
It also leverages a sequence of question-answer pairs as a blueprint plan for selecting salient visual concepts and determining how they should be assembled into a narrative.
arXiv Detail & Related papers (2023-10-08T21:45:34Z) - Intelligent Grimm -- Open-ended Visual Storytelling via Latent Diffusion
Models [70.86603627188519]
We focus on a novel, yet challenging task of generating a coherent image sequence based on a given storyline, denoted as open-ended visual storytelling.
We propose a learning-based auto-regressive image generation model, termed as StoryGen, with a novel vision-language context module.
We show StoryGen can generalize to unseen characters without any optimization, and generate image sequences with coherent content and consistent character.
arXiv Detail & Related papers (2023-06-01T17:58:50Z) - StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story
Continuation [76.44802273236081]
We develop a model StoryDALL-E for story continuation, where the generated visual story is conditioned on a source image.
We show that our retro-fitting approach outperforms GAN-based models for story continuation and facilitates copying of visual elements from the source image.
Overall, our work demonstrates that pretrained text-to-image synthesis models can be adapted for complex and low-resource tasks like story continuation.
arXiv Detail & Related papers (2022-09-13T17:47:39Z) - Guiding Neural Story Generation with Reader Models [5.935317028008691]
We introduce Story generation with Reader Models (StoRM), a framework in which a reader model is used to reason about the story should progress.
Experiments show that our model produces significantly more coherent and on-topic stories, outperforming baselines in dimensions including plot plausibility and staying on topic.
arXiv Detail & Related papers (2021-12-16T03:44:01Z) - PlotMachines: Outline-Conditioned Generation with Dynamic Plot State
Tracking [128.76063992147016]
We present PlotMachines, a neural narrative model that learns to transform an outline into a coherent story by tracking the dynamic plot states.
In addition, we enrich PlotMachines with high-level discourse structure so that the model can learn different writing styles corresponding to different parts of the narrative.
arXiv Detail & Related papers (2020-04-30T17:16:31Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of this site (including all information) and is not responsible for any consequences.