AREDSUM: Adaptive Redundancy-Aware Iterative Sentence Ranking for
Extractive Document Summarization
- URL: http://arxiv.org/abs/2004.06176v2
- Date: Sat, 3 Apr 2021 02:44:01 GMT
- Title: AREDSUM: Adaptive Redundancy-Aware Iterative Sentence Ranking for
Extractive Document Summarization
- Authors: Keping Bi, Rahul Jha, W. Bruce Croft, Asli Celikyilmaz
- Abstract summary: Redundancy-aware extractive summarization systems score the redundancy of the sentences to be included in a summary.
Previous work shows the efficacy of jointly scoring and selecting sentences with neural sequence generation models.
We present two adaptive learning models: AREDSUM-SEQ that jointly considers salience and novelty during sentence selection; and a two-step AREDSUM-CTX that scores salience first, then learns to balance salience and redundancy.
- Score: 46.00136909474304
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Redundancy-aware extractive summarization systems score the redundancy of the
sentences to be included in a summary either jointly with their salience
information or separately as an additional sentence scoring step. Previous work
shows the efficacy of jointly scoring and selecting sentences with neural
sequence generation models. It is, however, not well understood whether the gain is
due to better encoding techniques or to better redundancy reduction approaches.
Similarly, the respective contributions of the salience and diversity components to the
created summary have not been studied well. Building on the state-of-the-art encoding
methods for summarization, we present two adaptive learning models: AREDSUM-SEQ
that jointly considers salience and novelty during sentence selection; and a
two-step AREDSUM-CTX that scores salience first, then learns to balance
salience and redundancy, enabling the measurement of the impact of each aspect.
Empirical results on CNN/DailyMail and NYT50 datasets show that by modeling
diversity explicitly in a separate step, AREDSUM-CTX achieves significantly
better performance than AREDSUM-SEQ as well as state-of-the-art extractive
summarization baselines.
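As a rough illustration of the two-step idea behind AREDSUM-CTX (score salience first, then balance it against redundancy while building the summary), the sketch below implements a generic greedy, redundancy-penalized selection loop. The cosine-similarity redundancy measure and the fixed trade-off weight alpha are illustrative assumptions only; AREDSUM-CTX learns this balancing from data rather than using a hand-set weight.

# Minimal sketch, not the authors' code: a greedy, redundancy-penalized
# selection loop in the spirit of AREDSUM-CTX's two-step design.
# The cosine-similarity redundancy measure and the fixed weight `alpha`
# are illustrative assumptions; the paper learns this balance instead.
import numpy as np

def select_summary(sentence_embs, salience, k=3, alpha=0.7):
    """Greedily pick k sentence indices, trading salience against redundancy."""
    # Normalize embeddings so dot products are cosine similarities.
    embs = sentence_embs / np.linalg.norm(sentence_embs, axis=1, keepdims=True)
    selected = []
    candidates = set(range(len(salience)))
    while candidates and len(selected) < k:
        best, best_score = None, float("-inf")
        for i in candidates:
            # Redundancy = max similarity to any sentence already in the summary.
            redundancy = max((float(embs[i] @ embs[j]) for j in selected), default=0.0)
            score = alpha * salience[i] - (1 - alpha) * redundancy
            if score > best_score:
                best, best_score = i, score
        selected.append(best)
        candidates.remove(best)
    return selected

# Example usage with random stand-in embeddings and salience scores.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embs = rng.normal(size=(10, 64))
    salience = rng.uniform(size=10)
    print(select_summary(embs, salience, k=3))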
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We propose a systematically created human-annotated dataset consisting of coherent summaries for five publicly available datasets and natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (10% Rouge-L) in terms of producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- Enhancing Coherence of Extractive Summarization with Multitask Learning [40.349019691412465]
This study proposes a multitask learning architecture for extractive summarization with coherence boosting.
The architecture contains an extractive summarizer and coherent discriminator module.
Experiments show that our proposed method significantly improves the proportion of consecutive sentences in the extracted summaries.
arXiv Detail & Related papers (2023-05-22T09:20:58Z)
- DiffuSum: Generation Enhanced Extractive Summarization with Diffusion [14.930704950433324]
Extractive summarization aims to form a summary by directly extracting sentences from the source document.
This paper proposes DiffuSum, a novel paradigm for extractive summarization.
Experimental results show that DiffuSum achieves the new state-of-the-art extractive results on CNN/DailyMail with ROUGE scores of $44.83/22.56/40.56$.
arXiv Detail & Related papers (2023-05-02T19:09:16Z)
- Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1 score.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Bengali Abstractive News Summarization (BANS): A Neural Attention Approach [0.8793721044482612]
We present a seq2seq based Long Short-Term Memory (LSTM) network model with attention at encoder-decoder.
Our proposed system deploys a local attention-based model that produces long word sequences with lucid, human-like sentences.
We also prepared a dataset of more than 19k articles and corresponding human-written summaries collected from bangla.bdnews24.com.
arXiv Detail & Related papers (2020-12-03T08:17:31Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- Exploring Explainable Selection to Control Abstractive Summarization [51.74889133688111]
We develop a novel framework that focuses on explainability.
A novel pair-wise matrix captures the sentence interactions, centrality, and attribute scores.
A sentence-deployed attention mechanism in the abstractor ensures the final summary emphasizes the desired content.
arXiv Detail & Related papers (2020-04-24T14:39:34Z)