Fine-tuning GPT-3 for Russian Text Summarization
- URL: http://arxiv.org/abs/2108.03502v1
- Date: Sat, 7 Aug 2021 19:01:40 GMT
- Title: Fine-tuning GPT-3 for Russian Text Summarization
- Authors: Alexandr Nikolich, Arina Puchkova
- Abstract summary: This paper showcases ruGPT3's ability to summarize texts by fine-tuning it on corpora of Russian news with their corresponding human-generated summaries.
We evaluate the resulting texts with a set of metrics, showing that our solution can surpass the state-of-the-art model's performance without additional changes in architecture or loss function.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic summarization techniques aim to shorten and generalize information
given in the text while preserving its core message and the most relevant
ideas. This task can be approached with a variety of methods; however, few
attempts have been made to produce solutions specifically for the Russian
language, despite existing localizations of the state-of-the-art models. In
this paper, we aim to showcase ruGPT3's ability to summarize texts,
fine-tuning it on corpora of Russian news with their corresponding
human-generated summaries. Additionally, we employ hyperparameter tuning so
that the model's output becomes less random and more tied to the original text.
We evaluate the resulting texts with a set of metrics, showing that our
solution can surpass the state-of-the-art model's performance without
additional changes in architecture or loss function. Despite being able to
produce sensible summaries, our model still suffers from a number of flaws:
it is prone to altering named entities present in the original text (such as
surnames, places, and dates), deviating from facts stated in the given
document, and repeating information in the summary.
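As a rough illustration of the recipe the abstract describes (causal-LM fine-tuning on article-summary pairs, then tightening the decoding hyperparameters), consider the sketch below. The checkpoint name, separator convention, and every hyperparameter value are illustrative assumptions, not the authors' reported settings.

```python
# A minimal sketch (not the authors' code): summarizing with a fine-tuned
# ruGPT3 checkpoint while constraining the sampling hyperparameters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "sberbank-ai/rugpt3small_based_on_gpt2"  # assumed checkpoint choice
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def summarize(article: str, max_new_tokens: int = 96) -> str:
    # Fine-tuning is assumed to follow the usual causal-LM recipe: the model
    # sees "<article> <s> <summary>" pairs and learns to continue an article
    # with its summary after the separator ("<s>" here is an assumption).
    prompt = article.strip() + "\n<s>"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,         # lower temperature -> less random output
            top_p=0.9,               # nucleus sampling keeps decoding on-topic
            repetition_penalty=1.3,  # discourages the repetition flaw noted above
            no_repeat_ngram_size=3,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Strip the prompt tokens and return only the generated continuation.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True).strip()
```

Lowering the temperature and penalizing repeats are standard levers for exactly the randomness and repetition issues noted above, though they do not by themselves fix entity distortions.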
Related papers
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
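A minimal generic sketch of the multi-task idea in the TextFormer entry above: one shared backbone feeds classification, segmentation, and recognition heads whose losses are summed so all branches train jointly. Layer sizes, loss weights, and the toy backbone are illustrative assumptions, not TextFormer's actual architecture.

```python
# Generic shared-encoder multi-task sketch: joint training of classification,
# segmentation, and recognition heads enables the "deeper feature sharing"
# the entry mentions. All sizes are illustrative.
import torch
import torch.nn as nn

class MultiTaskSpotter(nn.Module):
    def __init__(self, d_model=256, num_classes=2, vocab_size=97, mask_size=28):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for the image encoder + Transformer
            nn.Conv2d(3, d_model, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.cls_head = nn.Linear(d_model, num_classes)            # text / no-text
        self.seg_head = nn.Linear(d_model, mask_size * mask_size)  # coarse mask logits
        self.rec_head = nn.Linear(d_model, vocab_size)             # character logits

    def forward(self, images):
        feats = self.backbone(images)
        return self.cls_head(feats), self.seg_head(feats), self.rec_head(feats)

# One joint training step: the three branch losses are simply summed.
model = MultiTaskSpotter()
cls_logits, seg_logits, rec_logits = model(torch.randn(4, 3, 128, 128))
loss = (
    nn.functional.cross_entropy(cls_logits, torch.randint(0, 2, (4,)))
    + nn.functional.binary_cross_entropy_with_logits(seg_logits, torch.rand(4, 28 * 28))
    + nn.functional.cross_entropy(rec_logits, torch.randint(0, 97, (4,)))
)
loss.backward()
```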
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
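The reference-free entailment reward described in the paper above can be approximated with an off-the-shelf NLI model; the checkpoint and label indexing below are a common public choice, not necessarily the authors' setup.

```python
# Sketch of a reference-free entailment reward: score how strongly the source
# document entails a candidate summary with an NLI model, and use that
# probability as the RL reward for each sampled summary.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NLI = "roberta-large-mnli"  # labels: 0 contradiction, 1 neutral, 2 entailment
tok = AutoTokenizer.from_pretrained(NLI)
nli = AutoModelForSequenceClassification.from_pretrained(NLI).eval()

def entailment_reward(source: str, summary: str) -> float:
    # premise = source document, hypothesis = generated summary
    enc = tok(source, summary, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        probs = nli(**enc).logits.softmax(dim=-1)[0]
    return probs[2].item()  # P(entailment)

# In an RL loop (e.g., PPO), this scalar would replace or augment the usual
# likelihood objective for each sampled summary.
print(entailment_reward("The cat sat on the mat.", "A cat is on a mat."))
```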
- On Improving Summarization Factual Consistency from Natural Language Feedback [35.03102318835244]
We study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment.
We collect a high-quality dataset, DeFacto, containing human demonstrations and informational natural language feedback.
We show that DeFacto can provide factually consistent human-edited summaries.
arXiv Detail & Related papers (2022-12-20T02:47:37Z)
- Enriching and Controlling Global Semantics for Text Summarization [11.037667460077813]
Transformer-based models have been proven effective in the abstractive summarization task by creating fluent and informative summaries.
We introduce a neural topic model empowered with normalizing flow to capture the global semantics of the document, which are then integrated into the summarization model.
Our method outperforms state-of-the-art summarization models on five common text summarization datasets.
arXiv Detail & Related papers (2021-09-22T09:31:50Z)
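A simplified sketch of the core idea in the entry above: inject a document-level topic vector into the summarizer's decoder states. The actual paper learns the topic posterior with a normalizing flow; the bag-of-words encoder and gating scheme here are illustrative stand-ins.

```python
# Simplified topic integration: a bag-of-words topic vector is broadcast over
# the decoder states and mixed in through a learned gate. Shapes and the
# gating mechanism are assumptions, not the paper's formulation.
import torch
import torch.nn as nn

class TopicGatedDecoderLayer(nn.Module):
    def __init__(self, d_model=512, vocab_size=30000, n_topics=64):
        super().__init__()
        self.topic_enc = nn.Sequential(      # bag-of-words -> topic proportions
            nn.Linear(vocab_size, n_topics), nn.Softmax(dim=-1)
        )
        self.topic_proj = nn.Linear(n_topics, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, hidden, bow):
        # hidden: (batch, seq, d_model) decoder states; bow: (batch, vocab_size)
        topic = self.topic_proj(self.topic_enc(bow))  # (batch, d_model)
        topic = topic.unsqueeze(1).expand_as(hidden)  # broadcast over time steps
        g = torch.sigmoid(self.gate(torch.cat([hidden, topic], dim=-1)))
        return g * hidden + (1 - g) * topic           # topic-aware states

layer = TopicGatedDecoderLayer()
states = layer(torch.randn(2, 10, 512), torch.rand(2, 30000))
print(states.shape)  # torch.Size([2, 10, 512])
```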
- ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization [7.16879432974126]
We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives to address this issue.
In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary.
We show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore.
arXiv Detail & Related papers (2021-09-09T08:35:39Z)
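ARMAN's sentence selection can be approximated generically: rank sentences by similarity to the document centroid (a stand-in for the paper's modified semantic score), mask the top ones, and use them as the pseudo-summary target.

```python
# Rough, generic approximation of ARMAN-style pre-training data creation.
# The TF-IDF centroid similarity below is a stand-in scoring function, not
# the paper's modified semantic score.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def make_pseudo_summary(sentences, k=2, mask_token="<MASK>"):
    vec = TfidfVectorizer().fit(sentences)
    X = vec.transform(sentences).toarray()   # (n_sents, vocab)
    centroid = X.mean(axis=0, keepdims=True)
    # cosine similarity of each sentence to the document centroid
    sims = (X @ centroid.T).ravel() / (
        np.linalg.norm(X, axis=1) * np.linalg.norm(centroid) + 1e-9
    )
    salient = set(np.argsort(-sims)[:k])     # indices of top-k salient sentences
    source = " ".join(mask_token if i in salient else s
                      for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(salient))
    return source, target  # encoder input with gaps, decoder pseudo-summary

src, tgt = make_pseudo_summary([
    "The parliament passed the budget on Tuesday.",
    "Lawmakers debated for six hours.",
    "The vote was 240 to 90.",
    "Observers called the session unusually calm.",
])
print(src)
print(tgt)
```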
- The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey [25.59111855107199]
Neural encoder-decoder models pioneered by the Seq2Seq framework have been proposed to achieve the goal of generating more abstractive summaries.
At a high level, such neural models can freely generate summaries without any constraint on the words or phrases used.
However, the neural model's abstraction ability is a double-edged sword.
arXiv Detail & Related papers (2021-04-30T08:46:13Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
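TextFlint's actual API is not reproduced here; the snippet below merely illustrates the kind of universal text transformation such a toolkit applies before re-scoring a model on perturbed inputs.

```python
# Illustrative robustness transformation (not TextFlint code): inject
# keyboard-adjacent typos at a fixed rate, then compare a model's metrics on
# original vs. perturbed inputs to expose brittle behavior.
import random

ADJACENT = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "n": "bm", "t": "ry"}

def typo_transform(text: str, rate: float = 0.1, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(text)
    for idx, ch in enumerate(chars):
        if ch.lower() in ADJACENT and rng.random() < rate:
            chars[idx] = rng.choice(ADJACENT[ch.lower()])
    return "".join(chars)

original = "the quick brown fox jumps over the lazy dog"
perturbed = typo_transform(original)
print(perturbed)
# A robustness report would then compare, e.g., ROUGE or accuracy on the
# original inputs against the perturbed ones.
```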
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
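A much-simplified cousin of Span-Fact's setup: flag named entities in a generated summary that never occur in the source, i.e. the spans a correction model would rewrite. The real models use QA-based span selection rather than string matching, and the spaCy model name is an assumption.

```python
# Simplified entity-consistency check: surface the summary spans that a
# Span-Fact-style correction model would target. String matching stands in
# for the paper's QA-based span selection.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed NER model

def inconsistent_entities(source: str, summary: str):
    src_text = source.lower()
    return [
        (ent.text, ent.label_)
        for ent in nlp(summary).ents
        if ent.text.lower() not in src_text  # entity absent from the source
    ]

source = "Chancellor Angela Merkel visited Paris on Monday."
summary = "Angela Schmidt visited Paris on Friday."
print(inconsistent_entities(source, summary))
# e.g. spans like ("Angela Schmidt", "PERSON") or ("Friday", "DATE"),
# which a correction model would rewrite using the source
```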
- Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints [63.84063384518667]
We propose a novel Transformer-based generation framework to achieve faithful table-to-text generation.
Core techniques in our method to enforce faithfulness include a new table-text optimal-transport matching loss.
To evaluate faithfulness, we propose a new automatic metric specialized to the table-to-text generation problem.
arXiv Detail & Related papers (2020-05-03T02:54:26Z)
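The table-text optimal-transport matching loss mentioned in the last entry can be sketched with a standard entropy-regularized Sinkhorn iteration over an embedding cost matrix; dimensions and the regularization weight are illustrative, not the paper's exact formulation.

```python
# Generic sketch of an optimal-transport matching loss between table-cell
# embeddings and generated-text token embeddings, via entropy-regularized
# Sinkhorn iterations.
import torch

def sinkhorn_ot_loss(table_emb, text_emb, eps=0.1, n_iters=50):
    # table_emb: (n, d), text_emb: (m, d); cost = 1 - cosine similarity
    a = torch.full((table_emb.size(0),), 1.0 / table_emb.size(0))
    b = torch.full((text_emb.size(0),), 1.0 / text_emb.size(0))
    cost = 1 - (torch.nn.functional.normalize(table_emb, dim=-1)
                @ torch.nn.functional.normalize(text_emb, dim=-1).T)
    K = torch.exp(-cost / eps)                  # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                    # Sinkhorn fixed-point updates
        u = a / (K @ (b / (K.T @ u)))
    v = b / (K.T @ u)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan, shape (n, m)
    return (plan * cost).sum()                  # expected matching cost

loss = sinkhorn_ot_loss(torch.randn(5, 64), torch.randn(9, 64))
print(float(loss))  # differentiable, so it could be added to a generation loss
```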
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.