Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection
- URL: http://arxiv.org/abs/2211.04903v1
- Date: Wed, 9 Nov 2022 14:12:09 GMT
- Title: Novel Chapter Abstractive Summarization using Spinal Tree Aware Sub-Sentential Content Selection
- Authors: Hardy Hardy, Miguel Ballesteros, Faisal Ladhak, Muhammad Khalifa, Vittorio Castelli, Kathleen McKeown
- Abstract summary: We present a pipelined extractive-abstractive approach to summarizing novel chapters.
We show an improvement of 3.71 Rouge-1 points over best results reported in prior work on an existing novel chapter dataset.
- Score: 29.30939223344407
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Summarizing novel chapters is a difficult task due to the input length and
the fact that sentences that appear in the desired summaries draw content from
multiple places throughout the chapter. We present a pipelined
extractive-abstractive approach where the extractive step filters the content
that is passed to the abstractive component. Extremely lengthy input also
results in a highly skewed dataset towards negative instances for extractive
summarization; we thus adopt a margin ranking loss for extraction to encourage
separation between positive and negative examples. Our extraction component
operates at the constituent level; our approach to this problem enriches the
text with spinal tree information which provides syntactic context (in the form
of constituents) to the extraction model. We show an improvement of 3.71
Rouge-1 points over best results reported in prior work on an existing novel
chapter dataset.
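
The abstract motivates a margin ranking loss by the strong skew towards negative instances in extraction. The sketch below illustrates the general pairwise hinge form of such a loss in plain Python. It is not taken from the paper: the function name, the all-pairs averaging, the margin value, and the example scores are assumptions chosen for illustration.

```python
from itertools import product


def margin_ranking_loss(pos_scores, neg_scores, margin=1.0):
    """Mean pairwise hinge loss over all (positive, negative) score pairs.

    A pair contributes zero once the positive score exceeds the negative
    score by at least `margin`; otherwise it contributes the shortfall.
    Averaging over every pair is an assumption of this sketch.
    """
    pairs = list(product(pos_scores, neg_scores))
    if not pairs:
        return 0.0
    total = sum(max(0.0, margin - (p - n)) for p, n in pairs)
    return total / len(pairs)


# Hypothetical extraction scores: few positives, more negatives.
pos = [2.0, 0.5]        # constituents that belong in the extract
neg = [0.0, 0.4, -1.0]  # constituents that do not
loss = margin_ranking_loss(pos, neg, margin=1.0)
print(round(loss, 4))
```

Because the loss depends only on score differences between positive and negative examples, it rewards separating the two groups rather than matching absolute labels, which is one reason ranking losses are often considered when negatives dominate.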
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- Document Summarization with Text Segmentation [7.954814600961461]
We exploit the innate document segment structure for improving the extractive summarization task.
We build two text segmentation models and find the best strategy for incorporating their output predictions.
arXiv Detail & Related papers (2023-01-20T22:24:22Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON utilizes the allocation of salience expectation to guide abstractive summarization and adapts well to articles in different abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- A Survey on Neural Abstractive Summarization Methods and Factual Consistency of Summarization [18.763290930749235]
Summarization is the process of computationally shortening a set of textual data to create a subset (a summary).
Existing summarization methods can be roughly divided into two types: extractive and abstractive.
An extractive summarizer explicitly selects text snippets from the source document, while an abstractive summarizer generates novel text snippets to convey the most salient concepts prevalent in the source.
arXiv Detail & Related papers (2022-04-20T14:56:36Z)
- StreamHover: Livestream Transcript Summarization and Annotation [54.41877742041611]
We present StreamHover, a framework for annotating and summarizing livestream transcripts.
With a total of over 500 hours of videos annotated with both extractive and abstractive summaries, our benchmark dataset is significantly larger than currently existing annotated corpora.
We show that our model generalizes better and improves performance over strong baselines.
arXiv Detail & Related papers (2021-09-11T02:19:37Z)
- Exploring Content Selection in Summarization of Novel Chapters [19.11830806780343]
We present a new summarization task, generating summaries of novel chapters using summary/chapter pairs from online study guides.
This is a harder task than the news summarization task, given the chapter length as well as the extreme paraphrasing and generalization found in the summaries.
We focus on extractive summarization, which requires the creation of a gold-standard set of extractive summaries.
arXiv Detail & Related papers (2020-05-04T20:45:39Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that unnecessity and redundancy issues exist when extracting full sentences.
We propose extracting sub-sentential units based on the constituency parsing tree.
arXiv Detail & Related papers (2020-04-06T13:35:10Z)
- The Shmoop Corpus: A Dataset of Stories with Loosely Aligned Summaries [72.48439126769627]
We introduce the Shmoop Corpus: a dataset of 231 stories paired with detailed multi-paragraph summaries for each individual chapter.
From the corpus, we construct a set of common NLP tasks, including Cloze-form question answering and a simplified form of abstractive summarization.
We believe that the unique structure of this corpus provides an important foothold towards making machine story comprehension more approachable.
arXiv Detail & Related papers (2019-12-30T21:03:59Z)