Fine-tuning GPT-3 for Russian Text Summarization
- URL: http://arxiv.org/abs/2108.03502v1
- Date: Sat, 7 Aug 2021 19:01:40 GMT
- Title: Fine-tuning GPT-3 for Russian Text Summarization
- Authors: Alexandr Nikolich, Arina Puchkova
- Abstract summary: This paper showcases ruGPT3's ability to summarize texts by fine-tuning it on corpora of Russian news with their corresponding human-generated summaries.
We evaluate the resulting texts with a set of metrics, showing that our solution can surpass the state-of-the-art model's performance without additional changes in architecture or loss function.
- Score: 77.34726150561087
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Automatic summarization techniques aim to shorten and generalize information
given in the text while preserving its core message and the most relevant
ideas. This task can be approached with a variety of methods; however, few
attempts have been made to produce solutions specifically for the Russian
language, despite existing localizations of the state-of-the-art models. In
this paper, we aim to showcase ruGPT3's ability to summarize texts,
fine-tuning it on corpora of Russian news with their corresponding
human-generated summaries. Additionally, we employ hyperparameter tuning so
that the model's output becomes less random and more tied to the original text.
We evaluate the resulting texts with a set of metrics, showing that our
solution can surpass the state-of-the-art model's performance without
additional changes in architecture or loss function. Despite being able to
produce sensible summaries, our model still suffers from a number of flaws:
it is prone to altering named entities present in the original text (such as
surnames, places, and dates), deviating from facts stated in the given
document, and repeating information in the summary.
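As a rough illustration of the recipe the abstract describes (causal-LM fine-tuning on article-summary pairs, then tightening the decoding hyperparameters), consider the sketch below. The checkpoint name, separator convention, and every hyperparameter value are illustrative assumptions, not the authors' reported settings.

```python
# A minimal sketch (not the authors' code): summarizing with a fine-tuned
# ruGPT3 checkpoint while constraining the sampling hyperparameters.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "sberbank-ai/rugpt3small_based_on_gpt2"  # assumed checkpoint choice
tokenizer = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForCausalLM.from_pretrained(MODEL).eval()

def summarize(article: str, max_new_tokens: int = 96) -> str:
    # Fine-tuning is assumed to follow the usual causal-LM recipe: the model
    # sees "<article> <s> <summary>" pairs and learns to continue an article
    # with its summary after the separator ("<s>" here is an assumption).
    prompt = article.strip() + "\n<s>"
    inputs = tokenizer(prompt, return_tensors="pt", truncation=True, max_length=1024)
    with torch.no_grad():
        out = model.generate(
            **inputs,
            max_new_tokens=max_new_tokens,
            do_sample=True,
            temperature=0.7,         # lower temperature -> less random output
            top_p=0.9,               # nucleus sampling keeps decoding on-topic
            repetition_penalty=1.3,  # discourages the repetition flaw noted above
            no_repeat_ngram_size=3,
            pad_token_id=tokenizer.eos_token_id,
        )
    # Strip the prompt tokens and return only the generated continuation.
    return tokenizer.decode(out[0][inputs["input_ids"].shape[1]:],
                            skip_special_tokens=True).strip()
```

Lowering the temperature and penalizing repeats are standard levers for exactly the randomness and repetition issues noted above, though they do not by themselves fix entity distortions.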
Related papers
- TextFormer: A Query-based End-to-End Text Spotter with Mixed Supervision [61.186488081379]
We propose TextFormer, a query-based end-to-end text spotter with Transformer architecture.
TextFormer builds upon an image encoder and a text decoder to learn a joint semantic understanding for multi-task modeling.
It allows for mutual training and optimization of classification, segmentation, and recognition branches, resulting in deeper feature sharing.
arXiv Detail & Related papers (2023-06-06T03:37:41Z)
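A minimal generic sketch of the multi-task idea in the TextFormer entry above: one shared backbone feeds classification, segmentation, and recognition heads whose losses are summed so all branches train jointly. Layer sizes, loss weights, and the toy backbone are illustrative assumptions, not TextFormer's actual architecture.

```python
# Generic shared-encoder multi-task sketch: joint training of classification,
# segmentation, and recognition heads enables the "deeper feature sharing"
# the entry mentions. All sizes are illustrative.
import torch
import torch.nn as nn

class MultiTaskSpotter(nn.Module):
    def __init__(self, d_model=256, num_classes=2, vocab_size=97, mask_size=28):
        super().__init__()
        self.backbone = nn.Sequential(  # stand-in for the image encoder + Transformer
            nn.Conv2d(3, d_model, kernel_size=7, stride=4, padding=3),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
        )
        self.cls_head = nn.Linear(d_model, num_classes)            # text / no-text
        self.seg_head = nn.Linear(d_model, mask_size * mask_size)  # coarse mask logits
        self.rec_head = nn.Linear(d_model, vocab_size)             # character logits

    def forward(self, images):
        feats = self.backbone(images)
        return self.cls_head(feats), self.seg_head(feats), self.rec_head(feats)

# One joint training step: the three branch losses are simply summed.
model = MultiTaskSpotter()
cls_logits, seg_logits, rec_logits = model(torch.randn(4, 3, 128, 128))
loss = (
    nn.functional.cross_entropy(cls_logits, torch.randint(0, 2, (4,)))
    + nn.functional.binary_cross_entropy_with_logits(seg_logits, torch.rand(4, 28 * 28))
    + nn.functional.cross_entropy(rec_logits, torch.randint(0, 97, (4,)))
)
loss.backward()
```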
- Factually Consistent Summarization via Reinforcement Learning with Textual Entailment Feedback [57.816210168909286]
We leverage recent progress on textual entailment models to address this problem for abstractive summarization systems.
We use reinforcement learning with reference-free, textual entailment rewards to optimize for factual consistency.
Our results, according to both automatic metrics and human evaluation, show that our method considerably improves the faithfulness, salience, and conciseness of the generated summaries.
arXiv Detail & Related papers (2023-05-31T21:04:04Z)
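The reference-free entailment reward described in the paper above can be approximated with an off-the-shelf NLI model; the checkpoint and label indexing below are a common public choice, not necessarily the authors' setup.

```python
# Sketch of a reference-free entailment reward: score how strongly the source
# document entails a candidate summary with an NLI model, and use that
# probability as the RL reward for each sampled summary.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NLI = "roberta-large-mnli"  # labels: 0 contradiction, 1 neutral, 2 entailment
tok = AutoTokenizer.from_pretrained(NLI)
nli = AutoModelForSequenceClassification.from_pretrained(NLI).eval()

def entailment_reward(source: str, summary: str) -> float:
    # premise = source document, hypothesis = generated summary
    enc = tok(source, summary, return_tensors="pt", truncation=True, max_length=512)
    with torch.no_grad():
        probs = nli(**enc).logits.softmax(dim=-1)[0]
    return probs[2].item()  # P(entailment)

# In an RL loop (e.g., PPO), this scalar would replace or augment the usual
# likelihood objective for each sampled summary.
print(entailment_reward("The cat sat on the mat.", "A cat is on a mat."))
```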
- On Improving Summarization Factual Consistency from Natural Language Feedback [35.03102318835244]
We study whether informational feedback in natural language can be leveraged to improve generation quality and user preference alignment.
We collect a high-quality dataset, DeFacto, containing human demonstrations and informational natural language feedback.
We show that DeFacto can provide factually consistent human-edited summaries.
arXiv Detail & Related papers (2022-12-20T02:47:37Z)
- Enriching and Controlling Global Semantics for Text Summarization [11.037667460077813]
Transformer-based models have been proven effective in the abstractive summarization task by creating fluent and informative summaries.
We introduce a neural topic model empowered with normalizing flow to capture the global semantics of the document, which are then integrated into the summarization model.
Our method outperforms state-of-the-art summarization models on five common text summarization datasets.
arXiv Detail & Related papers (2021-09-22T09:31:50Z)
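A simplified sketch of the core idea in the entry above: inject a document-level topic vector into the summarizer's decoder states. The actual paper learns the topic posterior with a normalizing flow; the bag-of-words encoder and gating scheme here are illustrative stand-ins.

```python
# Simplified topic integration: a bag-of-words topic vector is broadcast over
# the decoder states and mixed in through a learned gate. Shapes and the
# gating mechanism are assumptions, not the paper's formulation.
import torch
import torch.nn as nn

class TopicGatedDecoderLayer(nn.Module):
    def __init__(self, d_model=512, vocab_size=30000, n_topics=64):
        super().__init__()
        self.topic_enc = nn.Sequential(      # bag-of-words -> topic proportions
            nn.Linear(vocab_size, n_topics), nn.Softmax(dim=-1)
        )
        self.topic_proj = nn.Linear(n_topics, d_model)
        self.gate = nn.Linear(2 * d_model, d_model)

    def forward(self, hidden, bow):
        # hidden: (batch, seq, d_model) decoder states; bow: (batch, vocab_size)
        topic = self.topic_proj(self.topic_enc(bow))  # (batch, d_model)
        topic = topic.unsqueeze(1).expand_as(hidden)  # broadcast over time steps
        g = torch.sigmoid(self.gate(torch.cat([hidden, topic], dim=-1)))
        return g * hidden + (1 - g) * topic           # topic-aware states

layer = TopicGatedDecoderLayer()
states = layer(torch.randn(2, 10, 512), torch.rand(2, 30000))
print(states.shape)  # torch.Size([2, 10, 512])
```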
- ARMAN: Pre-training with Semantically Selecting and Reordering of Sentences for Persian Abstractive Summarization [7.16879432974126]
We propose ARMAN, a Transformer-based encoder-decoder model pre-trained with three novel objectives to address this issue.
In ARMAN, salient sentences from a document are selected according to a modified semantic score to be masked and form a pseudo summary.
We show that our proposed model achieves state-of-the-art performance on all six summarization tasks measured by ROUGE and BERTScore.
arXiv Detail & Related papers (2021-09-09T08:35:39Z)
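ARMAN's sentence selection can be approximated generically: rank sentences by similarity to the document centroid (a stand-in for the paper's modified semantic score), mask the top ones, and use them as the pseudo-summary target.

```python
# Rough, generic approximation of ARMAN-style pre-training data creation.
# The TF-IDF centroid similarity below is a stand-in scoring function, not
# the paper's modified semantic score.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer

def make_pseudo_summary(sentences, k=2, mask_token="<MASK>"):
    vec = TfidfVectorizer().fit(sentences)
    X = vec.transform(sentences).toarray()   # (n_sents, vocab)
    centroid = X.mean(axis=0, keepdims=True)
    # cosine similarity of each sentence to the document centroid
    sims = (X @ centroid.T).ravel() / (
        np.linalg.norm(X, axis=1) * np.linalg.norm(centroid) + 1e-9
    )
    salient = set(np.argsort(-sims)[:k])     # indices of top-k salient sentences
    source = " ".join(mask_token if i in salient else s
                      for i, s in enumerate(sentences))
    target = " ".join(sentences[i] for i in sorted(salient))
    return source, target  # encoder input with gaps, decoder pseudo-summary

src, tgt = make_pseudo_summary([
    "The parliament passed the budget on Tuesday.",
    "Lawmakers debated for six hours.",
    "The vote was 240 to 90.",
    "Observers called the session unusually calm.",
])
print(src)
print(tgt)
```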
- The Factual Inconsistency Problem in Abstractive Text Summarization: A Survey [25.59111855107199]
Neural encoder-decoder models pioneered by the Seq2Seq framework have been proposed to achieve the goal of generating more abstractive summaries.
At a high level, such neural models can freely generate summaries without any constraint on the words or phrases used.
However, the neural model's abstraction ability is a double-edged sword.
arXiv Detail & Related papers (2021-04-30T08:46:13Z)
- TextFlint: Unified Multilingual Robustness Evaluation Toolkit for Natural Language Processing [73.16475763422446]
We propose a multilingual robustness evaluation platform for NLP tasks (TextFlint).
It incorporates universal text transformation, task-specific transformation, adversarial attack, subpopulation, and their combinations to provide comprehensive robustness analysis.
TextFlint generates complete analytical reports as well as targeted augmented data to address the shortcomings of the model's robustness.
arXiv Detail & Related papers (2021-03-21T17:20:38Z)
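TextFlint's actual API is not reproduced here; the snippet below merely illustrates the kind of universal text transformation such a toolkit applies before re-scoring a model on perturbed inputs.

```python
# Illustrative robustness transformation (not TextFlint code): inject
# keyboard-adjacent typos at a fixed rate, then compare a model's metrics on
# original vs. perturbed inputs to expose brittle behavior.
import random

ADJACENT = {"a": "qs", "e": "wr", "i": "uo", "o": "ip", "n": "bm", "t": "ry"}

def typo_transform(text: str, rate: float = 0.1, seed: int = 0) -> str:
    rng = random.Random(seed)
    chars = list(text)
    for idx, ch in enumerate(chars):
        if ch.lower() in ADJACENT and rng.random() < rate:
            chars[idx] = rng.choice(ADJACENT[ch.lower()])
    return "".join(chars)

original = "the quick brown fox jumps over the lazy dog"
perturbed = typo_transform(original)
print(perturbed)
# A robustness report would then compare, e.g., ROUGE or accuracy on the
# original inputs against the perturbed ones.
```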
- Multi-Fact Correction in Abstractive Text Summarization [98.27031108197944]
Span-Fact is a suite of two factual correction models that leverages knowledge learned from question answering models to make corrections in system-generated summaries via span selection.
Our models employ single or multi-masking strategies to either iteratively or auto-regressively replace entities in order to ensure semantic consistency w.r.t. the source text.
Experiments show that our models significantly boost the factual consistency of system-generated summaries without sacrificing summary quality in terms of both automatic metrics and human evaluation.
arXiv Detail & Related papers (2020-10-06T02:51:02Z)
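A much-simplified cousin of Span-Fact's setup: flag named entities in a generated summary that never occur in the source, i.e. the spans a correction model would rewrite. The real models use QA-based span selection rather than string matching, and the spaCy model name is an assumption.

```python
# Simplified entity-consistency check: surface the summary spans that a
# Span-Fact-style correction model would target. String matching stands in
# for the paper's QA-based span selection.
import spacy

nlp = spacy.load("en_core_web_sm")  # assumed NER model

def inconsistent_entities(source: str, summary: str):
    src_text = source.lower()
    return [
        (ent.text, ent.label_)
        for ent in nlp(summary).ents
        if ent.text.lower() not in src_text  # entity absent from the source
    ]

source = "Chancellor Angela Merkel visited Paris on Monday."
summary = "Angela Schmidt visited Paris on Friday."
print(inconsistent_entities(source, summary))
# e.g. spans like ("Angela Schmidt", "PERSON") or ("Friday", "DATE"),
# which a correction model would rewrite using the source
```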
- Towards Faithful Neural Table-to-Text Generation with Content-Matching Constraints [63.84063384518667]
We propose a novel Transformer-based generation framework to achieve faithful table-to-text generation.
Core techniques in our method to enforce faithfulness include a new table-text optimal-transport matching loss.
To evaluate faithfulness, we propose a new automatic metric specialized to the table-to-text generation problem.
arXiv Detail & Related papers (2020-05-03T02:54:26Z)
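The table-text optimal-transport matching loss mentioned in the last entry can be sketched with a standard entropy-regularized Sinkhorn iteration over an embedding cost matrix; dimensions and the regularization weight are illustrative, not the paper's exact formulation.

```python
# Generic sketch of an optimal-transport matching loss between table-cell
# embeddings and generated-text token embeddings, via entropy-regularized
# Sinkhorn iterations.
import torch

def sinkhorn_ot_loss(table_emb, text_emb, eps=0.1, n_iters=50):
    # table_emb: (n, d), text_emb: (m, d); cost = 1 - cosine similarity
    a = torch.full((table_emb.size(0),), 1.0 / table_emb.size(0))
    b = torch.full((text_emb.size(0),), 1.0 / text_emb.size(0))
    cost = 1 - (torch.nn.functional.normalize(table_emb, dim=-1)
                @ torch.nn.functional.normalize(text_emb, dim=-1).T)
    K = torch.exp(-cost / eps)                  # Gibbs kernel
    u = torch.ones_like(a)
    for _ in range(n_iters):                    # Sinkhorn fixed-point updates
        u = a / (K @ (b / (K.T @ u)))
    v = b / (K.T @ u)
    plan = u.unsqueeze(1) * K * v.unsqueeze(0)  # transport plan, shape (n, m)
    return (plan * cost).sum()                  # expected matching cost

loss = sinkhorn_ot_loss(torch.randn(5, 64), torch.randn(9, 64))
print(float(loss))  # differentiable, so it could be added to a generation loss
```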
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the information presented and is not responsible for any consequences arising from its use.