TVRecap: A Dataset for Generating Stories with Character Descriptions
- URL: http://arxiv.org/abs/2109.08833v1
- Date: Sat, 18 Sep 2021 05:02:29 GMT
- Title: TVRecap: A Dataset for Generating Stories with Character Descriptions
- Authors: Mingda Chen, Kevin Gimpel
- Abstract summary: TVRecap is a story generation dataset that requires generating detailed TV show episode recaps from a brief summary and a set of documents describing the characters involved.
We create TVRecap from fan-contributed websites, which allows us to collect 26k episode recaps with 1868.7 tokens on average.
- Score: 43.198875830024825
- License: http://creativecommons.org/licenses/by-nc-sa/4.0/
- Abstract: We introduce TVRecap, a story generation dataset that requires generating
detailed TV show episode recaps from a brief summary and a set of documents
describing the characters involved. Unlike other story generation datasets,
TVRecap contains stories that are authored by professional screenwriters and
that feature complex interactions among multiple characters. Generating stories
in TVRecap requires drawing relevant information from the lengthy provided
documents about characters based on the brief summary. In addition, by swapping
the input and output, TVRecap can serve as a challenging testbed for
abstractive summarization. We create TVRecap from fan-contributed websites,
which allows us to collect 26k episode recaps with 1868.7 tokens on average.
Empirically, we take a hierarchical story generation approach and find that the
neural model that uses oracle content selectors for character descriptions
demonstrates the best performance on automatic metrics, showing the potential
of our dataset to inspire future research on story generation with constraints.
Qualitative analysis shows that the best-performing model sometimes generates
content that is unfaithful to the short summaries, suggesting promising
directions for future work.
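The input/output structure described in the abstract can be sketched roughly as follows. This is only an illustration: the class name `RecapExample`, its field names, and the way the source string is formatted are assumptions made here, not the dataset's actual schema or the authors' preprocessing.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass
class RecapExample:
    """Hypothetical container for one episode; field names are assumptions."""
    brief_summary: str                      # short episode summary
    character_descriptions: Dict[str, str]  # character name -> description document
    detailed_recap: str                     # long, detailed episode recap


def as_generation_pair(ex: RecapExample) -> Tuple[str, str]:
    """Story generation: (brief summary + character documents) -> detailed recap."""
    # Sort by character name so the source string is deterministic.
    docs = "\n".join(
        f"{name}: {text}"
        for name, text in sorted(ex.character_descriptions.items())
    )
    source = f"Summary: {ex.brief_summary}\nCharacters:\n{docs}"
    return source, ex.detailed_recap


def as_summarization_pair(ex: RecapExample) -> Tuple[str, str]:
    """Abstractive summarization: swap the roles, detailed recap -> brief summary."""
    return ex.detailed_recap, ex.brief_summary
```

With an invented toy episode, `as_generation_pair` returns the concatenated summary and character documents as the source and the recap as the target, while `as_summarization_pair` returns the recap as the source and the brief summary as the target.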
Related papers
- ScreenWriter: Automatic Screenplay Generation and Movie Summarisation [55.20132267309382]
Video content has driven demand for textual descriptions or summaries that allow users to recall key plot points or get an overview without watching.
We propose the task of automatic screenplay generation, and a method, ScreenWriter, that operates only on video and produces output which includes dialogue, speaker names, scene breaks, and visual descriptions.
ScreenWriter introduces a novel algorithm to segment the video into scenes based on the sequence of visual vectors, and a novel method for the challenging problem of determining character names, based on a database of actors' faces.
arXiv Detail & Related papers (2024-10-17T07:59:54Z)
- Generating Visual Stories with Grounded and Coreferent Characters [63.07511918366848]
We present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions.
Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark.
We also propose new evaluation metrics to measure the richness of characters and coreference in stories.
arXiv Detail & Related papers (2024-09-20T14:56:33Z)
- "Previously on ..." From Recaps to Story Summarization [13.311411816150551]
We introduce multimodal story summarization by leveraging TV episode recaps.
Story summarization labels are unlocked by matching recap shots to corresponding sub-stories in the episode.
We present a thorough evaluation on story summarization, including promising cross-series generalization.
arXiv Detail & Related papers (2024-05-19T09:09:54Z)
- Detecting and Grounding Important Characters in Visual Stories [18.870236356616907]
We introduce the VIST-Character dataset, which provides rich character-centric annotations.
Based on this dataset, we propose two new tasks: important character detection and character grounding in visual stories.
We develop simple, unsupervised models based on distributional similarity and pre-trained vision-and-language models.
arXiv Detail & Related papers (2023-03-30T18:24:06Z)
- VideoXum: Cross-modal Visual and Textural Summarization of Videos [54.0985975755278]
We propose a new joint video and text summarization task.
The goal is to generate both a shortened video clip and the corresponding textual summary from a long video.
The generated shortened video clip and text narratives should be semantically well aligned.
arXiv Detail & Related papers (2023-03-21T17:51:23Z)
- NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization [26.80378373420446]
NarraSum is a large-scale narrative summarization dataset.
It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries.
Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum.
arXiv Detail & Related papers (2022-12-02T22:51:51Z)
- StoryDALL-E: Adapting Pretrained Text-to-Image Transformers for Story Continuation [76.44802273236081]
We develop a model, StoryDALL-E, for story continuation, where the generated visual story is conditioned on a source image.
We show that our retro-fitting approach outperforms GAN-based models for story continuation and facilitates copying of visual elements from the source image.
Overall, our work demonstrates that pretrained text-to-image synthesis models can be adapted for complex and low-resource tasks like story continuation.
arXiv Detail & Related papers (2022-09-13T17:47:39Z)
- SummScreen: A Dataset for Abstractive Screenplay Summarization [52.56760815805357]
SummScreen is a dataset comprised of pairs of TV series transcripts and human-written recaps.
Plot details are often expressed indirectly in character dialogues and may be scattered across the entirety of the transcript.
Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics.
arXiv Detail & Related papers (2021-04-14T19:37:40Z)