Combination of abstractive and extractive approaches for summarization
of long scientific texts
- URL: http://arxiv.org/abs/2006.05354v2
- Date: Fri, 12 Jun 2020 11:25:21 GMT
- Title: Combination of abstractive and extractive approaches for summarization
of long scientific texts
- Authors: Vladislav Tretyak, Denis Stepanov
- Abstract summary: We present a method to generate summaries of long scientific documents using both extractive and abstractive approaches.
Our experiments showed that using extractive and abstractive models jointly significantly improves summarization results and ROUGE scores.
- License: http://creativecommons.org/publicdomain/zero/1.0/
- Abstract: In this research work, we present a method for generating summaries
of long scientific documents that combines the advantages of the extractive and
abstractive approaches. Before producing a summary abstractively, we perform an
extractive step whose output is then used to condition the abstractor module.
We use pre-trained transformer-based language models for both the extractor and
the abstractor. Our experiments show that using extractive and abstractive
models jointly significantly improves summarization quality and ROUGE scores.
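The extract-then-abstract pipeline can be pictured with a short sketch. This is a minimal illustration, not the authors' exact system: it assumes a simple TF-IDF centroid extractor and "facebook/bart-large-cnn" as a stand-in abstractor, whereas the paper uses pre-trained transformer models for both stages.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from transformers import pipeline
import numpy as np

def extract_top_sentences(sentences, k=10):
    """Score each sentence against the document centroid and keep the top k."""
    tfidf = TfidfVectorizer().fit_transform(sentences)   # (n_sents, vocab), sparse
    centroid = np.asarray(tfidf.mean(axis=0))            # (1, vocab) document centroid
    scores = (tfidf @ centroid.T).ravel()                # one score per sentence
    keep = sorted(np.argsort(scores)[-k:])               # top-k, in document order
    return [sentences[i] for i in keep]

# The abstractor is conditioned on the extractive output, not the full document.
abstractor = pipeline("summarization", model="facebook/bart-large-cnn")

def summarize(sentences, k=10):
    extract = " ".join(extract_top_sentences(sentences, k))
    return abstractor(extract, max_length=200, min_length=50)[0]["summary_text"]
```

The design point the sketch preserves is the conditioning: the abstractor never sees the full document, only the extractive summary.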
Related papers
- Towards Enhancing Coherence in Extractive Summarization: Dataset and Experiments with LLMs [70.15262704746378]
We present a systematically created, human-annotated dataset consisting of coherent summaries for five publicly available datasets, together with natural language user feedback.
Preliminary experiments with Falcon-40B and Llama-2-13B show significant performance improvements (about 10% ROUGE-L) in producing coherent summaries.
arXiv Detail & Related papers (2024-07-05T20:25:04Z)
- Improving Factuality of Abstractive Summarization via Contrastive Reward Learning [77.07192378869776]
We propose a simple but effective contrastive learning framework that incorporates recent developments in reward learning and factuality metrics.
Empirical studies demonstrate that the proposed framework enables summarization models to learn from the feedback of factuality metrics.
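A minimal sketch of the contrastive idea, assuming a BRIO-style pairwise ranking loss and an externally supplied `factuality_scores` vector (e.g., from a QA-based or entailment-based metric); the paper's exact loss may differ.

```python
import torch

def contrastive_factuality_loss(log_probs, factuality_scores, margin=0.01):
    """Pairwise ranking loss over candidate summaries of one source document.

    log_probs:         (n,) length-normalized model log-likelihoods of candidates
    factuality_scores: (n,) scores from an external factuality metric
    """
    order = torch.argsort(factuality_scores, descending=True)
    lp = log_probs[order]                 # candidates sorted, most factual first
    loss = log_probs.new_zeros(())
    for i in range(len(lp)):
        for j in range(i + 1, len(lp)):
            # A more factual candidate should outscore a less factual one by a
            # margin that grows with the rank gap (BRIO-style assumption).
            loss = loss + torch.clamp(margin * (j - i) - (lp[i] - lp[j]), min=0)
    return loss
```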
arXiv Detail & Related papers (2023-07-10T12:01:18Z)
- Salience Allocation as Guidance for Abstractive Summarization [61.31826412150143]
We propose a novel summarization approach with flexible and reliable salience guidance, namely SEASON (SaliencE Allocation as Guidance for Abstractive SummarizatiON).
SEASON utilizes the allocation of salience expectation to guide abstractive summarization and adapts well to articles with different levels of abstractiveness.
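One way to picture salience guidance, as a hedged sketch: discretize a per-token salience expectation into buckets and add a learned bucket embedding to the encoder states. The bucket count and the injection point are assumptions for illustration, not SEASON's exact design.

```python
import torch
import torch.nn as nn

class SalienceGuidedEncoder(nn.Module):
    def __init__(self, encoder, hidden_size, n_buckets=4):
        super().__init__()
        self.encoder = encoder                       # maps ids -> (batch, seq, hidden)
        self.salience_emb = nn.Embedding(n_buckets, hidden_size)
        self.n_buckets = n_buckets

    def forward(self, input_ids, salience):          # salience in [0, 1] per token
        hidden = self.encoder(input_ids)             # (batch, seq, hidden)
        buckets = (salience * self.n_buckets).clamp(max=self.n_buckets - 1).long()
        return hidden + self.salience_emb(buckets)   # inject salience guidance
```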
arXiv Detail & Related papers (2022-10-22T02:13:44Z)
- Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness [21.037841262371355]
A notable challenge in Multi-Document Summarization (MDS) is the extreme length of the input.
We present an extract-then-abstract Transformer framework to overcome this problem.
We propose a loss weighting mechanism that makes the model aware of the unequal importance of sentences that are not in the pseudo extraction oracle.
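A hedged sketch of such a weighting: non-oracle sentences are down-weighted in proportion to an externally computed "credit" (e.g., ROUGE overlap with the reference), so near-miss sentences are penalized less. The exact weighting formula here is an assumption, not the paper's.

```python
import torch
import torch.nn.functional as F

def credit_aware_extraction_loss(logits, oracle_labels, credit):
    """
    logits:        (n_sents,) extractor scores
    oracle_labels: (n_sents,) 1.0 for pseudo-oracle sentences, else 0.0
    credit:        (n_sents,) in [0, 1], importance of each non-oracle sentence
    """
    # Oracle sentences keep full weight; a non-oracle sentence with high credit
    # contributes less loss when the model (wrongly, per the oracle) selects it.
    weights = torch.where(oracle_labels > 0, torch.ones_like(credit), 1.0 - credit)
    return F.binary_cross_entropy_with_logits(logits, oracle_labels, weight=weights)
```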
arXiv Detail & Related papers (2022-05-04T04:40:39Z)
- Mitigating Data Scarceness through Data Synthesis, Augmentation and Curriculum for Abstractive Summarization [0.685316573653194]
We introduce a method of data synthesis with paraphrasing, a data augmentation technique with sample mixing, and curriculum learning with two new difficulty metrics based on specificity and abstractiveness.
We conduct experiments to show that these three techniques can help improve abstractive summarization across two summarization models and two different datasets.
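A small sketch of the curriculum component, with deliberately crude proxies: the paper defines its own specificity and abstractiveness metrics, so the two scorers below are illustrative stand-ins.

```python
def abstractiveness(source, summary):
    """Fraction of summary words that do not appear in the source (novel unigrams)."""
    src = set(source.lower().split())
    words = summary.lower().split()
    return sum(w not in src for w in words) / max(len(words), 1)

def specificity(summary):
    """Crude proxy for specificity: the share of long words in the summary."""
    words = summary.split()
    return sum(len(w) > 7 for w in words) / max(len(words), 1)

def curriculum_order(pairs):
    """Sort (source, summary) training pairs from easy to hard."""
    return sorted(pairs, key=lambda p: specificity(p[1]) + abstractiveness(*p))
```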
arXiv Detail & Related papers (2021-09-17T14:31:08Z)
- To Point or Not to Point: Understanding How Abstractive Summarizers Paraphrase Text [4.4044968357361745]
We characterize how one popular abstractive model, the pointer-generator model of See et al., uses its explicit copy/generation switch to control its level of abstraction.
When we modify the copy/generation switch and force the model to generate, only limited paraphrasing ability is revealed, alongside factual inaccuracies and hallucinations.
In line with previous research, these results suggest that abstractive summarization models lack the semantic understanding necessary to generate paraphrases that are both abstractive and faithful to the source document.
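The switch being probed is the pointer-generator mixture of See et al. (2017); a minimal sketch follows. Forcing the model to generate, as in the experiment above, amounts to clamping p_gen to 1. The sketch omits the extended vocabulary that the original model uses for out-of-vocabulary source tokens.

```python
import torch

def pointer_generator_dist(vocab_probs, attn, src_ids, p_gen, force_generate=False):
    """
    vocab_probs: (batch, vocab)   decoder softmax over the output vocabulary
    attn:        (batch, src_len) attention weights over source tokens
    src_ids:     (batch, src_len) vocabulary ids of the source tokens (int64)
    p_gen:       (batch, 1)       learned generation probability in [0, 1]
    """
    if force_generate:
        p_gen = torch.ones_like(p_gen)           # disable the copy mechanism
    copy_probs = torch.zeros_like(vocab_probs)
    copy_probs.scatter_add_(1, src_ids, attn)    # sum attention mass per token id
    return p_gen * vocab_probs + (1 - p_gen) * copy_probs
```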
arXiv Detail & Related papers (2021-06-03T04:03:15Z)
- EASE: Extractive-Abstractive Summarization with Explanations [18.046254486733186]
We present an explainable summarization system based on the Information Bottleneck principle.
Inspired by previous research showing that humans use a two-stage framework to summarize long documents, our framework first extracts a pre-defined number of evidence spans as explanations.
We show that explanations from our framework are more relevant than simple baselines, without substantially sacrificing the quality of the generated summary.
arXiv Detail & Related papers (2021-05-14T17:45:06Z)
- Constrained Abstractive Summarization: Preserving Factual Consistency with Constrained Generation [93.87095877617968]
We propose Constrained Abstractive Summarization (CAS), a general setup that preserves the factual consistency of abstractive summarization.
We adopt lexically constrained decoding, a technique generally applicable to autoregressive generative models, to fulfill CAS.
We observe gains of up to 13.8 ROUGE-2 points when only one manual constraint is used in interactive summarization.
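Lexically constrained decoding of this kind is available off the shelf; here is a hedged sketch using Hugging Face's constrained beam search (`force_words_ids`), with a made-up constraint word ("Geneva") standing in for a manual constraint.

```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-large-cnn")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large-cnn")

article = "..."  # source document text goes here
# Force the (hypothetical) constraint "Geneva" to appear in every finished beam.
constraint_ids = tokenizer(["Geneva"], add_special_tokens=False).input_ids

inputs = tokenizer(article, return_tensors="pt", truncation=True)
summary_ids = model.generate(
    **inputs,
    num_beams=4,                     # constrained search requires beam search
    force_words_ids=constraint_ids,
    max_length=128,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```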
arXiv Detail & Related papers (2020-10-24T00:27:44Z)
- Topic-Guided Abstractive Text Summarization: a Joint Learning Approach [19.623946402970933]
We introduce a new approach for abstractive text summarization, Topic-Guided Abstractive Summarization.
The idea is to incorporate neural topic modeling with a Transformer-based sequence-to-sequence (seq2seq) model in a joint learning framework.
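A hedged sketch of what "joint learning" can mean here: the summarizer's negative log-likelihood is combined with a VAE-style neural topic model objective under a weighting coefficient. How the topics condition the decoder is simplified away, and the weighting `lam` is an assumption.

```python
import torch
import torch.nn.functional as F

def ntm_loss(bow, recon_logits, mu, logvar):
    """VAE-style neural topic model loss: bag-of-words reconstruction + KL term."""
    recon = -(bow * F.log_softmax(recon_logits, dim=-1)).sum(-1).mean()
    kl = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).sum(-1).mean()
    return recon + kl

def joint_loss(summ_nll, bow, recon_logits, mu, logvar, lam=0.5):
    # Joint objective: summarization NLL plus the weighted topic-model loss.
    return summ_nll + lam * ntm_loss(bow, recon_logits, mu, logvar)
```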
arXiv Detail & Related papers (2020-10-20T14:45:25Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single- or multi-masking strategies to replace entities either iteratively or auto-regressively, ensuring semantic consistency with respect to the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
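A rough approximation of the span-selection idea, using an off-the-shelf extractive QA pipeline: mask a suspect entity in the summary and let the QA model pick a replacement span from the source. Treating the masked summary as the "question" is a simplification of the paper's setup, and `correct_entity` is a hypothetical helper.

```python
from transformers import pipeline

qa = pipeline("question-answering", model="deepset/roberta-base-squad2")

def correct_entity(summary, entity, source):
    # Replace the suspect entity with a placeholder and ask the QA model which
    # source span belongs there; iterate over entities for multi-masking.
    question = summary.replace(entity, "[MASK]")
    answer = qa(question=question, context=source)["answer"]
    return summary.replace(entity, answer)
```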
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- At Which Level Should We Extract? An Empirical Analysis on Extractive Document Summarization [110.54963847339775]
We show that extracting full sentences introduces unnecessary and redundant content.
We propose extracting sub-sentential units based on the constituency parse tree.
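A small sketch of what sub-sentential candidates look like: enumerate constituents with selected labels from a parse tree and treat each as an extraction unit. The hand-written parse below is illustrative; in practice a constituency parser supplies it.

```python
from nltk import Tree

def constituent_spans(tree, labels=("S", "NP", "VP")):
    """Yield the token span of every constituent whose label is of interest."""
    for sub in tree.subtrees():
        if sub.label() in labels:
            yield " ".join(sub.leaves())

parse = Tree.fromstring(
    "(S (NP (DT the) (NN model)) (VP (VBZ extracts) (NP (JJ short) (NNS units))))"
)
for span in constituent_spans(parse):
    print(span)  # candidates: full clause, subject NP, VP, object NP
```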
arXiv Detail & Related papers (2020-04-06T13:35:10Z)