Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text
Summarization
- URL: http://arxiv.org/abs/2012.11204v1
- Date: Mon, 21 Dec 2020 09:35:52 GMT
- Title: Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text
Summarization
- Authors: Mehrdad Farahani, Mohammad Gharachorloo, Mohammad Manthouri
- Abstract summary: This paper introduces a novel dataset named pn-summary for Persian abstractive text summarization.
The models employed in this paper are mT5 and an encoder-decoder version of the ParsBERT model.
- Score: 1.0742675209112622
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Text summarization is one of the most critical Natural Language Processing
(NLP) tasks. More and more research is conducted in this field every day.
Pre-trained transformer-based encoder-decoder models have begun to gain
popularity for these tasks. This paper proposes two methods to address this
task and introduces a novel dataset named pn-summary for Persian abstractive
text summarization. The models employed in this paper are mT5 and an
encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model
for Persian). These models are fine-tuned on the pn-summary dataset. The
current work is the first of its kind and, by achieving promising results, can
serve as a baseline for any future work.
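As a concrete illustration of the fine-tuning setup described in the abstract, the sketch below fine-tunes an mT5 checkpoint for abstractive summarization with the Hugging Face transformers and datasets libraries. It is a minimal sketch, not the authors' exact configuration: the google/mt5-small checkpoint, the local pn_summary_train.csv file, the article/summary column names, and all hyperparameters are illustrative assumptions.

```python
# Minimal fine-tuning sketch (assumed setup, not the paper's exact recipe).
from datasets import load_dataset
from transformers import (AutoTokenizer, DataCollatorForSeq2Seq,
                          MT5ForConditionalGeneration, Seq2SeqTrainer,
                          Seq2SeqTrainingArguments)

model_name = "google/mt5-small"  # small variant for illustration; the paper uses mT5
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = MT5ForConditionalGeneration.from_pretrained(model_name)

# Hypothetical local export of pn-summary with "article" and "summary" columns.
dataset = load_dataset("csv", data_files={"train": "pn_summary_train.csv"})["train"]

def preprocess(batch):
    # Tokenize Persian articles as inputs and reference summaries as labels.
    model_inputs = tokenizer(batch["article"], max_length=512, truncation=True)
    labels = tokenizer(text_target=batch["summary"], max_length=128, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized = dataset.map(preprocess, batched=True, remove_columns=dataset.column_names)

args = Seq2SeqTrainingArguments(
    output_dir="mt5-pn-summary",
    per_device_train_batch_size=4,
    num_train_epochs=3,
    learning_rate=5e-4,
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=args,
    train_dataset=tokenized,
    data_collator=DataCollatorForSeq2Seq(tokenizer, model=model),  # pads inputs and labels
)
trainer.train()
```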
Related papers
- Text Summarization Using Large Language Models: A Comparative Study of
MPT-7b-instruct, Falcon-7b-instruct, and OpenAI Chat-GPT Models [0.0]
Leveraging Large Language Models (LLMs) has shown remarkable promise in enhancing summarization techniques.
This paper explores text summarization with a diverse set of LLMs, including MPT-7b-instruct, Falcon-7b-instruct, and OpenAI's ChatGPT text-davinci-003 models.
arXiv Detail & Related papers (2023-10-16T14:33:02Z)
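For the comparative study above, inference amounts to instruction-prompting a decoder-only model. The sketch below shows that pattern with the Hugging Face transformers pipeline; the tiiuae/falcon-7b-instruct checkpoint, the prompt wording, and the decoding settings are assumptions for illustration, not the paper's exact protocol.

```python
# Hedged sketch of instruction-prompted summarization with an open LLM.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="tiiuae/falcon-7b-instruct",  # one of the open models compared above
    device_map="auto",                  # requires accelerate; shards across devices
)

article = "..."  # the document to be summarized
prompt = (
    "Summarize the following article in two or three sentences.\n\n"
    f"Article:\n{article}\n\nSummary:"
)

out = generator(prompt, max_new_tokens=128, do_sample=False, return_full_text=False)
print(out[0]["generated_text"].strip())
```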
- Abstractive Text Summarization for Resumes With Cutting Edge NLP Transformers and LSTM [0.0]
LSTM, pre-trained models, and fine-tuned models were assessed using a dataset of resumes.
The BART-Large model fine-tuned with the resume dataset gave the best performance.
arXiv Detail & Related papers (2023-06-23T06:33:20Z)
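The resume entry above fine-tunes BART-Large; that checkpoint is not publicly named, so the sketch below substitutes the generic facebook/bart-large-cnn summarizer to show the inference pattern.

```python
# Inference sketch with a generic BART-Large summarizer (stand-in checkpoint).
from transformers import pipeline

summarizer = pipeline("summarization", model="facebook/bart-large-cnn")
resume_text = "..."  # plain-text resume to condense
summary = summarizer(resume_text, max_length=120, min_length=30, do_sample=False)
print(summary[0]["summary_text"])
```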
- Z-Code++: A Pre-trained Language Model Optimized for Abstractive Summarization [108.09419317477986]
Z-Code++ is a new pre-trained language model optimized for abstractive text summarization.
The model is first pre-trained using text corpora for language understanding, and then is continually pre-trained on summarization corpora for grounded text generation.
Our model is parameter-efficient in that it outperforms the 600x larger PaLM-540B on XSum, and the finetuned 200x larger GPT3-175B on SAMSum.
arXiv Detail & Related papers (2022-08-21T01:00:54Z)
- Evaluation of Transfer Learning for Polish with a Text-to-Text Model [54.81823151748415]
We introduce a new benchmark for assessing the quality of text-to-text models for Polish.
The benchmark consists of diverse tasks and datasets: KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering.
We present plT5 - a general-purpose text-to-text model for Polish that can be fine-tuned on various Natural Language Processing (NLP) tasks with a single training objective.
arXiv Detail & Related papers (2022-05-18T09:17:14Z)
- HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization [57.798070356553936]
HETFORMER is a Transformer-based pre-trained model with multi-granularity sparse attention for extractive summarization.
Experiments on both single- and multi-document summarization tasks show that HETFORMER achieves state-of-the-art performance in Rouge F1.
arXiv Detail & Related papers (2021-10-12T22:42:31Z)
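HETFORMER's results above are reported as ROUGE F1; the snippet below shows how such scores are typically computed with the rouge-score package (the reference and prediction strings are placeholders).

```python
# Computing ROUGE-1/2/L F1 between a reference and a generated summary.
from rouge_score import rouge_scorer

scorer = rouge_scorer.RougeScorer(["rouge1", "rouge2", "rougeL"], use_stemmer=True)
reference = "the cabinet approved the new budget on monday"
prediction = "the new budget was approved by the cabinet"

for name, score in scorer.score(reference, prediction).items():
    print(f"{name}: F1 = {score.fmeasure:.3f}")
```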
- A Survey of Recent Abstract Summarization Techniques [0.0]
We investigate the impact of pre-training models on several Wikipedia datasets in English and Indonesian.
The most significant factors that influence ROUGE performance are coverage, density, and compression.
T5-Large, Pegasus-XSum, and ProphetNet-CNNDM provide the best summarization results.
arXiv Detail & Related papers (2021-04-15T20:01:34Z)
- Data-to-Text Generation with Iterative Text Editing [3.42658286826597]
We present a novel approach to data-to-text generation based on iterative text editing.
We first transform data items to text using trivial templates, and then iteratively improve the resulting text with a neural model trained for sentence fusion.
The model's output is filtered by a simple heuristic and reranked with an off-the-shelf pre-trained language model.
arXiv Detail & Related papers (2020-11-03T13:32:38Z)
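The iterative-editing pipeline above (template realization, sentence fusion, filtering, LM reranking) can be summarized schematically; in the sketch below the template, fusion model, and LM scorer are simple Python stand-ins, not the authors' trained components.

```python
# Schematic sketch of data-to-text generation by iterative text editing.
from typing import List


def template_realize(item: dict) -> str:
    # Trivial template: "<subject> <predicate> <object>."
    return f"{item['subject']} {item['predicate']} {item['object']}."


def fuse_sentences(text: str, new_sentence: str) -> List[str]:
    # Stand-in for a neural sentence-fusion model returning candidate fusions.
    return [f"{text.rstrip('.')} and {new_sentence[0].lower() + new_sentence[1:]}"]


def lm_score(sentence: str) -> float:
    # Stand-in for an off-the-shelf LM reranker: prefer shorter output here.
    return -float(len(sentence.split()))


def generate(items: List[dict]) -> str:
    text = template_realize(items[0])
    for item in items[1:]:
        candidates = fuse_sentences(text, template_realize(item))
        candidates = [c for c in candidates if c.strip()]   # simple filtering step
        text = max(candidates, key=lm_score)                # rerank with the LM score
    return text


print(generate([
    {"subject": "The cafe", "predicate": "serves", "object": "Italian food"},
    {"subject": "It", "predicate": "is located in", "object": "the city centre"},
]))
```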
- Improving Text Generation with Student-Forcing Optimal Transport [122.11881937642401]
We propose using optimal transport (OT) to match the sequences generated in training and testing modes.
An extension is also proposed to improve the OT learning, based on the structural and contextual information of the text sequences.
The effectiveness of the proposed method is validated on machine translation, text summarization, and text generation tasks.
arXiv Detail & Related papers (2020-10-12T19:42:25Z)
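The OT-based matching above operates on sequence representations; the NumPy sketch below computes an entropy-regularized (Sinkhorn) OT cost between two sets of token embeddings with a cosine cost, which is the kind of sequence-level term such a method could add to the training loss. It is an illustrative sketch, not the paper's exact formulation.

```python
# Entropy-regularized OT (Sinkhorn) cost between two token-embedding sequences.
import numpy as np


def sinkhorn_ot_cost(X: np.ndarray, Y: np.ndarray,
                     epsilon: float = 0.1, n_iters: int = 100) -> float:
    """X: (n, d) embeddings of one sequence; Y: (m, d) embeddings of the other."""
    # Cosine cost matrix: 1 - cosine similarity for every embedding pair.
    Xn = X / np.linalg.norm(X, axis=1, keepdims=True)
    Yn = Y / np.linalg.norm(Y, axis=1, keepdims=True)
    C = 1.0 - Xn @ Yn.T

    # Uniform marginals over the two sequences.
    a = np.full(X.shape[0], 1.0 / X.shape[0])
    b = np.full(Y.shape[0], 1.0 / Y.shape[0])

    K = np.exp(-C / epsilon)          # Gibbs kernel
    u = np.ones_like(a)
    for _ in range(n_iters):          # Sinkhorn iterations
        v = b / (K.T @ u)
        u = a / (K @ v)

    P = u[:, None] * K * v[None, :]   # transport plan
    return float(np.sum(P * C))       # expected cost under the plan


rng = np.random.default_rng(0)
print(sinkhorn_ot_cost(rng.normal(size=(7, 16)), rng.normal(size=(5, 16))))
```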
- POINTER: Constrained Progressive Text Generation via Insertion-based Generative Pre-training [93.79766670391618]
We present POINTER, a novel insertion-based approach for hard-constrained text generation.
The proposed method operates by progressively inserting new tokens between existing tokens in a parallel manner.
The resulting coarse-to-fine hierarchy makes the generation process intuitive and interpretable.
arXiv Detail & Related papers (2020-05-01T18:11:54Z)
- Pre-training for Abstractive Document Summarization by Reinstating Source Text [105.77348528847337]
This paper presents three pre-training objectives which allow us to pre-train a Seq2Seq based abstractive summarization model on unlabeled text.
Experiments on two benchmark summarization datasets show that all three objectives can improve performance upon baselines.
arXiv Detail & Related papers (2020-04-04T05:06:26Z)
This list is automatically generated from the titles and abstracts of the papers on this site.
The site does not guarantee the quality of this information and is not responsible for any consequences of its use.