What Have We Achieved on Text Summarization?
- URL: http://arxiv.org/abs/2010.04529v1
- Date: Fri, 9 Oct 2020 12:39:33 GMT
- Title: What Have We Achieved on Text Summarization?
- Authors: Dandan Huang, Leyang Cui, Sen Yang, Guangsheng Bao, Kun Wang, Jun Xie,
Yue Zhang
- Abstract summary: We aim to gain more understanding of summarization systems with respect to their strengths and limits on a fine-grained syntactic and semantic level.
We manually quantify 8 major sources of error across 10 representative summarization models.
- Score: 32.90169694110989
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Deep learning has led to significant improvement in text summarization with
various methods investigated and improved ROUGE scores reported over the years.
However, gaps still exist between summaries produced by automatic summarizers
and human professionals. Aiming to gain more understanding of summarization
systems with respect to their strengths and limits on a fine-grained syntactic
and semantic level, we consult the Multidimensional Quality Metric (MQM) and
quantify 8 major sources of errors on 10 representative summarization models
manually. Primarily, we find that 1) under similar settings, extractive
summarizers are in general better than their abstractive counterparts thanks to
strength in faithfulness and factual consistency; 2) milestone techniques such
as copy, coverage and hybrid extractive/abstractive methods do bring specific
improvements but also demonstrate limitations; 3) pre-training techniques, and
in particular sequence-to-sequence pre-training, are highly effective for
improving text summarization, with BART giving the best results.
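The ROUGE scores the abstract refers to measure n-gram overlap between a system summary and a reference. Evaluations normally use an established package (e.g. Google's rouge-score), but as a minimal, self-contained sketch of the idea, ROUGE-2 F1 under the simplifying assumptions of whitespace tokenization and no stemming looks like this:

```python
from collections import Counter

def rouge2_f1(reference: str, candidate: str) -> float:
    """ROUGE-2 F1: harmonic mean of bigram precision and recall
    between a candidate summary and a reference summary."""
    def bigrams(text):
        tokens = text.lower().split()
        return Counter(zip(tokens, tokens[1:]))

    ref, cand = bigrams(reference), bigrams(candidate)
    if not ref or not cand:
        return 0.0
    overlap = sum((ref & cand).values())  # clipped bigram matches
    if overlap == 0:
        return 0.0
    recall = overlap / sum(ref.values())
    precision = overlap / sum(cand.values())
    return 2 * precision * recall / (precision + recall)

# identical summaries score 1.0; summaries with no shared bigrams score 0.0
print(rouge2_f1("the cat sat on the mat", "the cat sat on the mat"))  # 1.0
```

Because it rewards surface overlap only, ROUGE cannot detect the faithfulness and factual-consistency errors that the MQM-based manual analysis above is designed to surface.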
Related papers
- Rank Your Summaries: Enhancing Bengali Text Summarization via Ranking-based Approach [0.0]
This paper aims to identify the most accurate and informative summary for a given text by utilizing a simple but effective ranking-based approach.
We utilize four pre-trained summarization models to generate summaries, followed by applying a text ranking algorithm to identify the most suitable summary.
Experimental results suggest that by leveraging the strengths of each pre-trained transformer model, our methodology significantly improves the accuracy and effectiveness of the Bengali text summarization.
arXiv Detail & Related papers (2023-07-14T15:07:20Z)
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
- Improving Factuality of Abstractive Summarization without Sacrificing Summary Quality [27.57037141986362]
We propose EFACTSUM (i.e., Effective Factual Summarization) to improve summary factuality without sacrificing summary quality.
We show that using a contrastive learning framework with our refined candidate summaries leads to significant gains on both factuality and similarity-based metrics.
arXiv Detail & Related papers (2023-05-24T10:15:17Z)
- Generating Multiple-Length Summaries via Reinforcement Learning for Unsupervised Sentence Summarization [44.835811239393244]
Sentence summarization shortens a given text while preserving its core content.
Unsupervised approaches summarize texts without relying on human-written summaries.
We devise an abstractive model based on reinforcement learning without ground-truth summaries.
arXiv Detail & Related papers (2022-12-21T08:34:28Z)
- Human-in-the-loop Abstractive Dialogue Summarization [61.4108097664697]
We propose to incorporate different levels of human feedback into the training process.
This will enable us to guide the models to capture the behaviors humans care about for summaries.
arXiv Detail & Related papers (2022-12-19T19:11:27Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON uses the allocation of salience expectation to guide abstractive summarization and adapts well to articles with different levels of abstractiveness.
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- Meta-Transfer Learning for Low-Resource Abstractive Summarization [12.757403709439325]
Low-Resource Abstractive Summarization aims to leverage past experience to improve performance given only limited labeled examples of the target corpus.
We conduct extensive experiments on various summarization corpora with different writing styles and forms.
The results demonstrate that our approach achieves the state-of-the-art on 6 corpora in low-resource scenarios, with only 0.7% of trainable parameters compared to previous work.
arXiv Detail & Related papers (2021-02-18T14:42:09Z)
- Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation [93.87095877617968]
We propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization.
We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS.
We observe up to 13.8 ROUGE-2 gains when only one manual constraint is used in interactive summarization.
arXiv Detail & Related papers (2020-10-24T00:27:44Z)
- SupMMD: A Sentence Importance Model for Extractive Summarization using Maximum Mean Discrepancy [92.5683788430012]
SupMMD is a novel technique for generic and update summarization based on the maximum mean discrepancy (MMD) from kernel two-sample testing.
We show the efficacy of SupMMD in both generic and update summarization tasks by meeting or exceeding the current state-of-the-art on the DUC-2004 and TAC-2009 datasets.
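SupMMD's full formulation is in the paper; as a generic illustration only, the squared maximum mean discrepancy underlying kernel two-sample testing can be estimated from two samples as E[k(x,x')] + E[k(y,y')] - 2·E[k(x,y)]. The toy scalar samples and the RBF kernel below are assumptions for the sketch, not the paper's setup:

```python
import math

def rbf(x, y, gamma=1.0):
    """RBF kernel on scalars."""
    return math.exp(-gamma * (x - y) ** 2)

def mmd_squared(xs, ys, gamma=1.0):
    """Biased estimate of squared MMD between samples xs and ys:
    mean k(x,x') + mean k(y,y') - 2 * mean k(x,y)."""
    kxx = sum(rbf(a, b, gamma) for a in xs for b in xs) / (len(xs) ** 2)
    kyy = sum(rbf(a, b, gamma) for a in ys for b in ys) / (len(ys) ** 2)
    kxy = sum(rbf(a, b, gamma) for a in xs for b in ys) / (len(xs) * len(ys))
    return kxx + kyy - 2 * kxy

# samples drawn close together yield a near-zero MMD;
# well-separated samples yield a larger one
same = mmd_squared([0.1, 0.2, 0.3], [0.15, 0.25, 0.35])
diff = mmd_squared([0.1, 0.2, 0.3], [5.1, 5.2, 5.3])
print(same < diff)  # True
```

Intuitively, a sentence whose kernel embedding is close to that of the reference-summary distribution (small MMD) is a good candidate for extraction.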
arXiv Detail & Related papers (2020-10-06T09:26:55Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.