NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization
- URL: http://arxiv.org/abs/2212.01476v2
- Date: Wed, 28 Jun 2023 04:08:20 GMT
- Title: NarraSum: A Large-Scale Dataset for Abstractive Narrative Summarization
- Authors: Chao Zhao, Faeze Brahman, Kaiqiang Song, Wenlin Yao, Dian Yu, Snigdha Chaturvedi
- Abstract summary: NarraSum is a large-scale narrative summarization dataset.
It contains 122K narrative documents, which are collected from plot descriptions of movies and TV episodes with diverse genres, and their corresponding abstractive summaries.
Experiments show that there is a large performance gap between humans and the state-of-the-art summarization models on NarraSum.
- Score: 26.80378373420446
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Narrative summarization aims to produce a distilled version of a narrative to
describe its most salient events and characters. Summarizing a narrative is
challenging as it requires an understanding of event causality and character
behaviors. To encourage research in this direction, we propose NarraSum, a
large-scale narrative summarization dataset. It contains 122K narrative
documents, which are collected from plot descriptions of movies and TV episodes
with diverse genres, and their corresponding abstractive summaries. Experiments
show that there is a large performance gap between humans and the
state-of-the-art summarization models on NarraSum. We hope that this dataset
will promote future research in summarization, as well as broader studies of
natural language understanding and generation. The dataset is available at
https://github.com/zhaochaocs/narrasum.
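As a quick illustration of working with document-summary pairs like those in NarraSum, the sketch below parses a single hypothetical JSONL record and computes its word-level compression ratio, a common sanity check for summarization data. The field names "document" and "summary" and the inline sample text are assumptions for illustration only; consult the repository above for the actual file format.

```python
import json

# Hypothetical record in the style of a narrative summarization dataset.
# Field names and content are illustrative, not the actual NarraSum schema.
sample_jsonl = (
    '{"document": "A detective investigates a series of crimes across the city '
    'and slowly uncovers the culprit behind them.", '
    '"summary": "A detective solves a string of crimes."}'
)

record = json.loads(sample_jsonl)
doc_words = record["document"].split()
sum_words = record["summary"].split()

# Compression ratio: how much shorter the summary is than the source document.
ratio = len(sum_words) / len(doc_words)
print(f"{len(doc_words)} -> {len(sum_words)} words (ratio {ratio:.2f})")
```

For a real corpus, the same check aggregated over all pairs helps catch malformed records (e.g., summaries longer than their documents).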
Related papers
- Generating Visual Stories with Grounded and Coreferent Characters [63.07511918366848]
We present the first model capable of predicting visual stories with consistently grounded and coreferent character mentions.
Our model is finetuned on a new dataset which we build on top of the widely used VIST benchmark.
We also propose new evaluation metrics to measure the richness of characters and coreference in stories.
arXiv Detail & Related papers (2024-09-20T14:56:33Z)
- VideoXum: Cross-modal Visual and Textural Summarization of Videos [54.0985975755278]
We propose a new joint video and text summarization task.
The goal is to generate both a shortened video clip along with the corresponding textual summary from a long video.
The generated shortened video clip and text narratives should be semantically well aligned.
arXiv Detail & Related papers (2023-03-21T17:51:23Z)
- Synopses of Movie Narratives: a Video-Language Dataset for Story Understanding [13.52545041750095]
We release a video-language story dataset, Synopses of Movie Narratives (SyMoN), containing 5,193 video summaries of popular movies and TV series with a total length of 869 hours.
SyMoN captures naturalistic storytelling videos made by human creators and intended for a human audience.
arXiv Detail & Related papers (2022-03-11T01:45:33Z)
- TVRecap: A Dataset for Generating Stories with Character Descriptions [43.198875830024825]
TVRecap is a story generation dataset whose task is to produce detailed TV show episode recaps from a brief summary and documents describing the characters involved.
We create TVRecap from fan-contributed websites, which allows us to collect 26k episode recaps with 1868.7 tokens on average.
arXiv Detail & Related papers (2021-09-18T05:02:29Z)
- SummScreen: A Dataset for Abstractive Screenplay Summarization [52.56760815805357]
SummScreen is a dataset of pairs of TV series transcripts and human-written recaps.
Plot details are often expressed indirectly in character dialogues and may be scattered across the entirety of the transcript.
Since characters are fundamental to TV series, we also propose two entity-centric evaluation metrics.
arXiv Detail & Related papers (2021-04-14T19:37:40Z)
- Abstractive Summarization of Spoken and Written Instructions with BERT [66.14755043607776]
We present the first application of the BERTSum model to conversational language.
We generate abstractive summaries of narrated instructional videos across a wide variety of topics.
We envision this integrated as a feature in intelligent virtual assistants, enabling them to summarize both written and spoken instructional content upon request.
arXiv Detail & Related papers (2020-08-21T20:59:34Z)
- PlotMachines: Outline-Conditioned Generation with Dynamic Plot State Tracking [128.76063992147016]
We present PlotMachines, a neural narrative model that learns to transform an outline into a coherent story by tracking the dynamic plot states.
In addition, we enrich PlotMachines with high-level discourse structure so that the model can learn different writing styles corresponding to different parts of the narrative.
arXiv Detail & Related papers (2020-04-30T17:16:31Z)
- Screenplay Summarization Using Latent Narrative Structure [78.45316339164133]
We propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models.
We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays.
Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode.
arXiv Detail & Related papers (2020-04-27T11:54:19Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the accuracy of the information presented and is not responsible for any consequences arising from its use.