`Keep it Together': Enforcing Cohesion in Extractive Summaries by
Simulating Human Memory
- URL: http://arxiv.org/abs/2402.10643v1
- Date: Fri, 16 Feb 2024 12:43:26 GMT
- Title: `Keep it Together': Enforcing Cohesion in Extractive Summaries by
Simulating Human Memory
- Authors: Ronald Cardenas and Matthias Galle and Shay B. Cohen
- Abstract summary: In this paper, we aim to enforce cohesion whilst controlling for informativeness and redundancy in summaries.
Our sentence selector simulates human memory to keep track of topics.
It is possible to extract highly cohesive summaries that nevertheless read as informative to humans.
- Score: 22.659031563705245
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Extractive summaries are usually presented as lists of sentences with no
expected cohesion between them. In this paper, we aim to enforce cohesion
whilst controlling for informativeness and redundancy in summaries, in cases
where the input exhibits high redundancy. The pipeline controls for redundancy
in long inputs as they are consumed, and balances informativeness and cohesion
during sentence selection. Our sentence selector simulates human memory to keep
track of topics (modeled as lexical chains), enforcing cohesive ties between
noun phrases. Across a variety of domains, our experiments revealed that it is
possible to extract highly cohesive summaries that humans nevertheless find as
informative as summaries extracted by accounting only for
informativeness or redundancy. The extracted summaries exhibit smooth topic
transitions between sentences as signaled by lexical chains, with chains
spanning adjacent or near-adjacent sentences.
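The selection idea in the abstract can be illustrated with a toy sketch. This is not the authors' pipeline: it assumes each sentence is already reduced to a set of noun-phrase terms, that an informativeness score per sentence is given, and that cohesion can be approximated by term overlap with a small bounded memory of recently selected sentences (a crude stand-in for lexical chains). The function name `select_sentences` and the parameters `budget`, `memory_size`, and `weight` are hypothetical, and redundancy control is left out.

```python
from collections import deque


def select_sentences(sentences, scores, budget=3, memory_size=2, weight=0.5):
    """Greedily pick up to `budget` sentences, mixing informativeness and cohesion.

    sentences: list of sets of terms (assumed to be pre-extracted noun phrases).
    scores:    list of informativeness scores in [0, 1], one per sentence.
    weight:    share of the combined score given to cohesion (hypothetical knob).
    """
    # Bounded memory: term sets of the most recently selected sentences only.
    memory = deque(maxlen=memory_size)
    chosen = []
    candidates = set(range(len(sentences)))

    while candidates and len(chosen) < budget:
        remembered = set().union(*memory)

        def combined(i):
            terms = sentences[i]
            # Cohesion: fraction of this sentence's terms still held in memory.
            cohesion = len(terms & remembered) / len(terms) if terms else 0.0
            return (1 - weight) * scores[i] + weight * cohesion

        # Ties go to the earliest sentence because candidates are scanned in order.
        best = max(sorted(candidates), key=combined)
        chosen.append(best)
        candidates.remove(best)
        memory.append(sentences[best])

    # Present the summary in document order.
    return sorted(chosen)


sentences = [
    {"summary", "cohesion"},
    {"cohesion", "chain"},
    {"weather"},
    {"chain", "memory"},
]
scores = [0.9, 0.5, 0.6, 0.4]

print(select_sentences(sentences, scores))              # cohesion-aware selection
print(select_sentences(sentences, scores, weight=0.0))  # informativeness only
```

With `weight=0.0` the selector ranks purely by the given scores; raising `weight` favors sentences that share terms with what is still in memory, which is one simple way to encourage chains that span adjacent or near-adjacent selected sentences.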
Related papers
- Attributable and Scalable Opinion Summarization [79.87892048285819]
We generate abstractive summaries by decoding frequent encodings, and extractive summaries by selecting the sentences assigned to the same frequent encodings.
Our method is attributable, because the model identifies sentences used to generate the summary as part of the summarization process.
It scales easily to many hundreds of input reviews, because aggregation is performed in the latent space rather than over long sequences of tokens.
arXiv Detail & Related papers (2023-05-19T11:30:37Z)
- On the Trade-off between Redundancy and Local Coherence in Summarization [20.16107829497668]
We investigate the trade-offs incurred when aiming to control for inter-sentential cohesion and redundancy in extracted summaries.
We find that the proposed unsupervised systems manage to extract highly cohesive summaries across varying levels of document redundancy.
arXiv Detail & Related papers (2022-05-20T14:10:28Z)
- SNaC: Coherence Error Detection for Narrative Summarization [73.48220043216087]
We introduce SNaC, a narrative coherence evaluation framework rooted in fine-grained annotations for long summaries.
We develop a taxonomy of coherence errors in generated narrative summaries and collect span-level annotations for 6.6k sentences across 150 book and movie screenplay summaries.
Our work provides the first characterization of coherence errors generated by state-of-the-art summarization models and a protocol for eliciting coherence judgments from crowd annotators.
arXiv Detail & Related papers (2022-05-19T16:01:47Z)
- A Survey on Neural Abstractive Summarization Methods and Factual Consistency of Summarization [18.763290930749235]
Summarization is the process of computationally shortening a set of textual data to create a subset (a summary).
Existing summarization methods can be roughly divided into two types: extractive and abstractive.
An extractive summarizer explicitly selects text snippets from the source document, while an abstractive summarizer generates novel text snippets to convey the most salient concepts prevalent in the source.
arXiv Detail & Related papers (2022-04-20T14:56:36Z)
- Extractive Summarization of Call Transcripts [77.96603959765577]
This paper presents an indigenously developed method that combines topic modeling and sentence selection with punctuation restoration in ill-punctuated or un-punctuated call transcripts.
Extensive testing, evaluation and comparisons have demonstrated the efficacy of this summarizer for call transcript summarization.
arXiv Detail & Related papers (2021-03-19T02:40:59Z)
- Unsupervised Extractive Summarization using Pointwise Mutual Information [5.544401446569243]
We propose new metrics of relevance and redundancy using pointwise mutual information (PMI) between sentences.
We show that our method outperforms similarity-based methods on datasets in a range of domains including news, medical journal articles, and personal anecdotes.
arXiv Detail & Related papers (2021-02-11T21:05:50Z)
- Relation Clustering in Narrative Knowledge Graphs [71.98234178455398]
Relational sentences in the original text are embedded (with SBERT) and clustered in order to merge semantically similar relations.
Preliminary tests show that such clustering might successfully detect similar relations, and provide a valuable preprocessing for semi-supervised approaches.
arXiv Detail & Related papers (2020-11-27T10:43:04Z)
- Understanding Points of Correspondence between Sentences for Abstractive Summarization [39.7404761923196]
We present an investigation into fusing sentences drawn from a document by introducing the notion of points of correspondence.
We create a dataset containing the documents, source and fusion sentences, and human annotations of points of correspondence between sentences.
arXiv Detail & Related papers (2020-06-10T02:42:38Z)
- Screenplay Summarization Using Latent Narrative Structure [78.45316339164133]
We propose to explicitly incorporate the underlying structure of narratives into general unsupervised and supervised extractive summarization models.
We formalize narrative structure in terms of key narrative events (turning points) and treat it as latent in order to summarize screenplays.
Experimental results on the CSI corpus of TV screenplays, which we augment with scene-level summarization labels, show that latent turning points correlate with important aspects of a CSI episode.
arXiv Detail & Related papers (2020-04-27T11:54:19Z)
- Extractive Summarization as Text Matching [123.09816729675838]
This paper creates a paradigm shift with regard to the way we build neural extractive summarization systems.
We formulate the extractive summarization task as a semantic text matching problem.
We have driven the state-of-the-art extractive result on CNN/DailyMail to a new level (44.41 in ROUGE-1).
arXiv Detail & Related papers (2020-04-19T08:27:57Z)
This list is automatically generated from the titles and abstracts of the papers in this site.