AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization
- URL: http://arxiv.org/abs/2203.10945v1
- Date: Mon, 21 Mar 2022 13:11:41 GMT
- Title: AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization
- Authors: Moussa Kamal Eddine, Nadi Tomeh, Nizar Habash, Joseph Le Roux, Michalis Vazirgiannis
- Abstract summary: We propose AraBART, the first Arabic model in which the encoder and the decoder are pretrained end-to-end, based on BART.
We show that AraBART achieves the best performance on multiple abstractive summarization datasets.
- Score: 23.540743628126837
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: Like most natural language understanding and generation tasks,
state-of-the-art models for summarization are transformer-based
sequence-to-sequence architectures that are pretrained on large corpora. While
most existing models focused on English, Arabic remained understudied. In this
paper we propose AraBART, the first Arabic model in which the encoder and the
decoder are pretrained end-to-end, based on BART. We show that AraBART achieves
the best performance on multiple abstractive summarization datasets,
outperforming strong baselines including a pretrained Arabic BERT-based model
and multilingual mBART and mT5 models.
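As a concrete illustration of how such an encoder-decoder checkpoint is typically applied to abstractive summarization (this is not part of the paper), below is a minimal sketch using the Hugging Face transformers API; the checkpoint identifier and decoding settings are assumptions, not details taken from the abstract.

```python
# A minimal usage sketch, assuming a released AraBART checkpoint is available on
# the Hugging Face Hub under the identifier below (an assumption, not stated in
# the abstract). Generation settings are illustrative only.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "moussaKam/AraBART"  # assumed checkpoint identifier
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

article = "..."  # replace with an Arabic article to summarize

# Encode the source article and generate an abstractive summary with beam search.
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=512)
summary_ids = model.generate(
    **inputs,
    num_beams=4,
    max_length=128,
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```

A summarization model fine-tuned on a specific dataset would be used the same way, only with weights adapted to that dataset.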
Related papers
- VBART: The Turkish LLM [0.0]
VBART is the first Turkish sequence-to-sequence large language model pre-trained on a large corpus from scratch.
Fine-tuned VBART models surpass the prior state-of-the-art results in abstractive text summarization, title generation, text paraphrasing, question answering and question generation tasks.
arXiv Detail & Related papers (2024-03-02T20:40:11Z)
- On the importance of Data Scale in Pretraining Arabic Language Models [46.431706010614334]
We conduct a comprehensive study on the role of data in Arabic Pretrained Language Models (PLMs).
We reassess the performance of a suite of state-of-the-art Arabic PLMs by retraining them on massive-scale, high-quality Arabic corpora.
Our analysis strongly suggests that pretraining data is by far the primary contributor to performance, surpassing other factors.
arXiv Detail & Related papers (2024-01-15T15:11:15Z)
- Sequence-to-Sequence Spanish Pre-trained Language Models [23.084770129038215]
This paper introduces the implementation and evaluation of renowned encoder-decoder architectures exclusively pre-trained on Spanish corpora.
We present Spanish versions of BART, T5, and BERT2BERT-style models and subject them to a comprehensive assessment across various sequence-to-sequence tasks.
Our findings underscore the competitive performance of all models, with the BART- and T5-based models emerging as top performers across all tasks.
arXiv Detail & Related papers (2023-09-20T12:35:19Z)
- Cross-Lingual NER for Financial Transaction Data in Low-Resource Languages [70.25418443146435]
We propose an efficient modeling framework for cross-lingual named entity recognition in semi-structured text data.
We employ two independent datasets of SMSs in English and Arabic, each carrying semi-structured banking transaction information.
With access to only 30 labeled samples, our model can generalize the recognition of merchants, amounts, and other fields from English to Arabic.
arXiv Detail & Related papers (2023-07-16T00:45:42Z)
- Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification [59.625179404482594]
Randomly masking text spans in ordinary texts during the pre-training stage hardly allows models to acquire the ability to generate simple texts.
We propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.
arXiv Detail & Related papers (2023-05-21T14:03:49Z)
- Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation [77.47617360812023]
We extend the recently proposed MAE style pre-training strategy, RetroMAE, to support a wide variety of sentence representation tasks.
The first stage performs RetroMAE over generic corpora, like Wikipedia, BookCorpus, etc., from which the base model is learned.
The second stage takes place on domain-specific data, e.g., MS MARCO and NLI, where the base model is further trained with RetroMAE and contrastive learning.
arXiv Detail & Related papers (2022-07-30T14:34:55Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- ParsBERT: Transformer-based Model for Persian Language Understanding [0.7646713951724012]
This paper proposes a monolingual BERT for the Persian language (ParsBERT).
It demonstrates state-of-the-art performance compared to other architectures and multilingual models.
ParsBERT obtains higher scores on all datasets, including existing ones as well as newly composed ones.
arXiv Detail & Related papers (2020-05-26T05:05:32Z)
- BERT Fine-tuning For Arabic Text Summarization [0.0]
Our model works with multilingual BERT (as the Arabic language does not have a pretrained BERT of its own).
We show its performance on an English corpus first before applying it to Arabic corpora in both extractive and abstractive tasks.
arXiv Detail & Related papers (2020-03-29T20:23:14Z)
- Multilingual Denoising Pre-training for Neural Machine Translation [132.66750663226287]
mBART is a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora.
mBART is one of the first methods for pre-training a complete sequence-to-sequence model.
arXiv Detail & Related papers (2020-01-22T18:59:17Z)
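The denoising pre-training mentioned in the mBART entry above is the BART-family recipe that, per the abstract, AraBART also builds on: corrupt the input text (e.g., mask a span) and train the encoder-decoder to reconstruct the original. Below is a minimal sketch of one such step with a public mBART checkpoint; the checkpoint name, language codes, and toy example are illustrative assumptions rather than either paper's actual pretraining setup.

```python
# Minimal sketch of a BART/mBART-style text-infilling step: the model receives a
# corrupted input and is trained to reconstruct the original sentence.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "facebook/mbart-large-cc25"  # public mBART checkpoint (illustrative)
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

tokenizer.src_lang = "en_XX"  # for Arabic text, mBART-cc25 uses the "ar_AR" code
original = "The quick brown fox jumps over the lazy dog."
corrupted = "The quick <mask> over the lazy dog."  # one masked span (text infilling)

inputs = tokenizer(corrupted, return_tensors="pt")
# In denoising pre-training the target is the uncorrupted text in the same language,
# so tokenizing it the same way as the source is sufficient for this sketch.
labels = tokenizer(original, return_tensors="pt").input_ids

# The pre-training loss is the cross-entropy of reconstructing the original text.
loss = model(**inputs, labels=labels).loss
loss.backward()  # one illustrative backward pass
```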
This list is automatically generated from the titles and abstracts of the papers on this site.
This site does not guarantee the quality of the list (including all information) and is not responsible for any consequences.