Abstractive Text Summarization Using the BRIO Training Paradigm
- URL: http://arxiv.org/abs/2305.13696v1
- Date: Tue, 23 May 2023 05:09:53 GMT
- Title: Abstractive Text Summarization Using the BRIO Training Paradigm
- Authors: Khang Nhut Lam and Thieu Gia Doan and Khang Thua Pham and Jugal Kalita
- Abstract summary: This paper presents a technique to improve abstractive summaries by fine-tuning pre-trained language models.
We build a text summarization dataset for Vietnamese, called VieSum.
We perform experiments with abstractive summarization models trained with the BRIO paradigm on the CNNDM and the VieSum datasets.
- Score: 2.102846336724103
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Summary sentences produced by abstractive summarization models may be
coherent and comprehensive, but they lack control and rely heavily on reference
summaries. The BRIO training paradigm assumes a non-deterministic distribution
to reduce the model's dependence on reference summaries and improve model
performance during inference. This paper presents a straightforward but
effective technique to improve abstractive summaries by fine-tuning pre-trained
language models, and training them with the BRIO paradigm. We build a text
summarization dataset for Vietnamese, called VieSum. We perform experiments
with abstractive summarization models trained with the BRIO paradigm on the
CNNDM and the VieSum datasets. The results show that the models, trained on
basic hardware, outperform all existing abstractive summarization models,
especially for Vietnamese.
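For readers unfamiliar with the paradigm, the sketch below illustrates the kind of objective BRIO-style training combines: a standard MLE term on the reference summary plus a contrastive margin-ranking term that pushes the model to assign higher length-normalized log-probability to candidate summaries with higher ROUGE. This is a minimal PyTorch sketch under stated assumptions, not the authors' implementation; the function names, hyperparameter values, and toy inputs are illustrative only.

```python
# Minimal sketch of a BRIO-style training loss.
# Assumptions: candidate summaries are pre-generated and already sorted
# from best to worst by ROUGE against the reference; `token_log_probs`
# holds the (padded) per-token log-probabilities the model assigns to
# each candidate. All names and hyperparameters are illustrative.
import torch


def length_normalized_score(token_log_probs: torch.Tensor,
                            lengths: torch.Tensor,
                            alpha: float = 1.0) -> torch.Tensor:
    """Score each candidate by its summed token log-probability,
    normalized by length raised to the penalty exponent `alpha`."""
    return token_log_probs.sum(dim=-1) / (lengths ** alpha)


def contrastive_ranking_loss(scores: torch.Tensor,
                             margin: float = 0.001) -> torch.Tensor:
    """Margin ranking loss over candidates sorted best-to-worst:
    a better-ranked candidate should receive a higher model score,
    with a margin that grows with the rank gap (j - i)."""
    loss = torch.zeros((), dtype=scores.dtype)
    n = scores.size(0)
    for i in range(n):
        for j in range(i + 1, n):
            loss = loss + torch.clamp(scores[j] - scores[i] + (j - i) * margin, min=0)
    return loss


def brio_total_loss(mle_loss: torch.Tensor,
                    scores: torch.Tensor,
                    gamma: float = 100.0) -> torch.Tensor:
    """Combined objective: cross-entropy on the reference summary
    plus a weighted contrastive ranking term over the candidates."""
    return mle_loss + gamma * contrastive_ranking_loss(scores)


# Toy usage: 4 candidates, already sorted by ROUGE (padding ignored here).
token_log_probs = torch.randn(4, 50)
lengths = torch.tensor([42.0, 50.0, 38.0, 45.0])
scores = length_normalized_score(token_log_probs, lengths)
loss = brio_total_loss(mle_loss=torch.tensor(2.3), scores=scores)
```

In practice the MLE term would come from the fine-tuned pre-trained language model's cross-entropy on the reference summary, and the candidate scores would be recomputed from the same model at each training step.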
Related papers
- From News to Summaries: Building a Hungarian Corpus for Extractive and Abstractive Summarization [0.19107347888374507]
HunSum-2 is an open-source Hungarian corpus suitable for training abstractive and extractive summarization models.
The dataset is assembled from segments of the Common Crawl corpus undergoing thorough cleaning.
arXiv Detail & Related papers (2024-04-04T16:07:06Z)
- Information-Theoretic Distillation for Reference-less Summarization [67.51150817011617]
We present a novel framework to distill a powerful summarizer based on the information-theoretic objective for summarization.
We start off from Pythia-2.8B as the teacher model, which is not yet capable of summarization.
We arrive at a compact but powerful summarizer with only 568M parameters that performs competitively against ChatGPT.
arXiv Detail & Related papers (2024-03-20T17:42:08Z)
- How Ready are Pre-trained Abstractive Models and LLMs for Legal Case Judgement Summarization? [4.721618284417204]
In recent years, abstractive summarization models have been gaining popularity.
Legal domain-specific pre-trained abstractive summarization models are now available.
General-domain pre-trained Large Language Models (LLMs) are known to generate high-quality text.
arXiv Detail & Related papers (2023-06-02T03:16:19Z)
- Abstractive Summary Generation for the Urdu Language [1.9594639581421422]
We employ a transformer-based model that utilizes self-attention mechanisms to encode the input text and generate a summary.
Our experiments show that our model can produce summaries that are grammatically correct and semantically meaningful.
arXiv Detail & Related papers (2023-05-25T15:55:42Z)
- Inverse Reinforcement Learning for Text Summarization [52.765898203824975]
We introduce inverse reinforcement learning (IRL) as an effective paradigm for training abstractive summarization models.
Experimental results across datasets in different domains demonstrate the superiority of our proposed IRL model for summarization over MLE and RL baselines.
arXiv Detail & Related papers (2022-12-19T23:45:05Z)
- Correcting Diverse Factual Errors in Abstractive Summarization via Post-Editing and Language Model Infilling [56.70682379371534]
We show that our approach vastly outperforms prior methods in correcting erroneous summaries.
Our model -- FactEdit -- improves factuality scores by over 11 points on CNN/DM and over 31 points on XSum.
arXiv Detail & Related papers (2022-10-22T07:16:19Z)
- COLO: A Contrastive Learning based Re-ranking Framework for One-Stage Summarization [84.70895015194188]
We propose a Contrastive Learning based re-ranking framework for one-stage summarization called COLO.
COLO boosts the extractive and abstractive results of one-stage systems on CNN/DailyMail benchmark to 44.58 and 46.33 ROUGE-1 score.
arXiv Detail & Related papers (2022-09-29T06:11:21Z)
- Dialogue Summarization with Supporting Utterance Flow Modeling and Fact Regularization [58.965859508695225]
We propose an end-to-end neural model for dialogue summarization with two novel modules.
The supporting utterance flow modeling helps to generate a coherent summary by smoothly shifting the focus from the former utterances to the later ones.
The fact regularization encourages the generated summary to be factually consistent with the ground-truth summary during model training.
arXiv Detail & Related papers (2021-08-03T03:09:25Z)
- Liputan6: A Large-scale Indonesian Dataset for Text Summarization [43.375797352517765]
We harvest articles from Liputan6.com, an online news portal, and obtain 215,827 document-summary pairs.
We leverage pre-trained language models to develop benchmark extractive and abstractive summarization methods over the dataset.
arXiv Detail & Related papers (2020-11-02T02:01:12Z)
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
- Learning by Semantic Similarity Makes Abstractive Summarization Better [13.324006587838522]
We compare the generated summaries from recent LM, BART, and the reference summaries from a benchmark dataset, CNN/DM.
Interestingly, model-generated summaries receive higher scores relative to reference summaries.
arXiv Detail & Related papers (2020-02-18T17:59:02Z)
This list is automatically generated from the titles and abstracts of the papers in this site.
This site does not guarantee the quality of the listed content (including all information) and is not responsible for any consequences of its use.