BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
- URL: http://arxiv.org/abs/2109.09701v1
- Date: Mon, 20 Sep 2021 17:14:22 GMT
- Title: BARTpho: Pre-trained Sequence-to-Sequence Models for Vietnamese
- Authors: Nguyen Luong Tran, Duong Minh Le and Dat Quoc Nguyen
- Abstract summary: We present BARTpho, the first public large-scale monolingual sequence-to-sequence models pre-trained for Vietnamese.
Our BARTpho uses the "large" architecture and pre-training scheme of the sequence-to-sequence denoising model BART.
Experiments on a downstream task of Vietnamese text summarization show that our BARTpho outperforms the strong baseline mBART.
- Score: 5.955739135932037
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We present BARTpho with two versions -- BARTpho_word and BARTpho_syllable --
the first public large-scale monolingual sequence-to-sequence models
pre-trained for Vietnamese. Our BARTpho uses the "large" architecture and
pre-training scheme of the sequence-to-sequence denoising model BART, thus
especially suitable for generative NLP tasks. Experiments on a downstream task
of Vietnamese text summarization show that in both automatic and human
evaluations, our BARTpho outperforms the strong baseline mBART and improves the
state-of-the-art. We release BARTpho to facilitate future research and
applications of generative Vietnamese NLP tasks. Our BARTpho models are
available at: https://github.com/VinAIResearch/BARTpho
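For quick experimentation, the released checkpoints can be loaded through the Hugging Face transformers library. The snippet below is a minimal sketch, assuming the models are published on the Hugging Face Hub under the vinai/bartpho-syllable and vinai/bartpho-word identifiers referenced in the linked repository; consult the repository README for the exact names and recommended settings.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Assumed Hub identifier; see https://github.com/VinAIResearch/BARTpho for
# the officially published checkpoint names.
MODEL_NAME = "vinai/bartpho-syllable"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, use_fast=False)
bartpho = AutoModel.from_pretrained(MODEL_NAME)

# Encode a Vietnamese sentence and extract contextual features.
sentence = "Chúng tôi là những nghiên cứu viên."
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bartpho(**inputs)

last_hidden_states = outputs.last_hidden_state  # (batch, seq_len, hidden)
```

For downstream generative tasks such as summarization, the same checkpoints can instead be loaded with a sequence-to-sequence head (e.g., AutoModelForSeq2SeqLM) and fine-tuned as with any BART-style model.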
Related papers
- Empowering Backbone Models for Visual Text Generation with Input Granularity Control and Glyph-Aware Training [68.41837295318152]
Diffusion-based text-to-image models have demonstrated impressive achievements in diversity and aesthetics but struggle to generate images with visual texts.
Existing backbone models have limitations such as misspelling, failure to generate text, and a lack of support for Chinese text.
We propose a series of methods, aiming to empower backbone models to generate visual texts in English and Chinese.
arXiv Detail & Related papers (2024-10-06T10:25:39Z)
- BARTPhoBEiT: Pre-trained Sequence-to-Sequence and Image Transformers Models for Vietnamese Visual Question Answering [3.0938904602244355]
Visual Question Answering (VQA) is an intricate and demanding task that integrates natural language processing (NLP) and computer vision (CV).
We introduce a transformer-based Vietnamese model named BARTPhoBEiT.
This model combines pre-trained Sequence-to-Sequence and bidirectional encoder representations from Image Transformers for Vietnamese, and is evaluated on Vietnamese VQA datasets.
arXiv Detail & Related papers (2023-07-28T06:23:32Z)
- Teaching the Pre-trained Model to Generate Simple Texts for Text Simplification [59.625179404482594]
Randomly masking text spans in ordinary texts in the pre-training stage hardly allows models to acquire the ability to generate simple texts.
We propose a new continued pre-training strategy to teach the pre-trained model to generate simple texts.
arXiv Detail & Related papers (2023-05-21T14:03:49Z)
- ViDeBERTa: A powerful pre-trained language model for Vietnamese [10.000783498978604]
This paper presents ViDeBERTa, a new pre-trained monolingual language model for Vietnamese.
Three versions - ViDeBERTa_xsmall, ViDeBERTa_base, and ViDeBERTa_large - are pre-trained on a large-scale corpus of high-quality and diverse Vietnamese texts.
We fine-tune and evaluate our model on three important natural language downstream tasks: part-of-speech tagging, named-entity recognition, and question answering.
arXiv Detail & Related papers (2023-01-25T07:26:54Z)
- Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation [77.47617360812023]
We extend the recently proposed MAE style pre-training strategy, RetroMAE, to support a wide variety of sentence representation tasks.
The first stage performs RetroMAE over generic corpora, like Wikipedia, BookCorpus, etc., from which the base model is learned.
The second stage takes place on domain-specific data, e.g., MS MARCO and NLI, where the base model is continually trained based on RetroMAE and contrastive learning.
arXiv Detail & Related papers (2022-07-30T14:34:55Z)
- Pretraining is All You Need for Image-to-Image Translation [59.43151345732397]
We propose to use pretraining to boost general image-to-image translation.
We show that the proposed pretraining-based image-to-image translation (PITI) is capable of synthesizing images of unprecedented realism and faithfulness.
arXiv Detail & Related papers (2022-05-25T17:58:26Z)
- AraBART: a Pretrained Arabic Sequence-to-Sequence Model for Abstractive Summarization [23.540743628126837]
We propose AraBART, the first Arabic model in which the encoder and the decoder are pretrained end-to-end, based on BART.
We show that AraBART achieves the best performance on multiple abstractive summarization datasets.
arXiv Detail & Related papers (2022-03-21T13:11:41Z)
- BARThez: a Skilled Pretrained French Sequence-to-Sequence Model [19.508391246171115]
We introduce BARThez, the first large-scale pretrained seq2seq model for French.
Being based on BART, BARThez is particularly well-suited for generative tasks.
We show BARThez to be very competitive with state-of-the-art BERT-based French language models.
arXiv Detail & Related papers (2020-10-23T11:57:33Z)
- Recipes for Adapting Pre-trained Monolingual and Multilingual Models to Machine Translation [50.0258495437314]
We investigate the benefits and drawbacks of freezing parameters, and adding new ones, when fine-tuning a pre-trained model on Machine Translation (MT).
For BART we get the best performance by freezing most of the model parameters, and adding extra positional embeddings.
For mBART we match or outperform naive fine-tuning for most language pairs with the encoder, and most of the decoder, frozen.
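As a rough illustration of that recipe, the sketch below freezes most parameters of a generic pre-trained BART checkpoint before machine-translation fine-tuning. It is a hypothetical example built on the Hugging Face transformers API, not the paper's released code, and it keeps only the layer-norm parameters trainable as a stand-in for the paper's specific choice of trainable modules (e.g., the extra positional embeddings).

```python
from transformers import AutoModelForSeq2SeqLM

# Illustrative checkpoint; the cited paper studies BART and mBART variants.
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-large")

# Freeze most of the model, keeping only layer-norm parameters trainable.
# The paper's exact recipe (which modules stay trainable, the added
# positional embeddings) differs; this only shows the freezing mechanics.
for name, param in model.named_parameters():
    param.requires_grad = "layer_norm" in name or "layernorm" in name

trainable = sum(p.numel() for p in model.parameters() if p.requires_grad)
total = sum(p.numel() for p in model.parameters())
print(f"trainable parameters: {trainable:,} of {total:,}")
```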
arXiv Detail & Related papers (2020-04-30T16:09:22Z)
- PhoBERT: Pre-trained language models for Vietnamese [11.685916685552982]
We present PhoBERT, the first public large-scale monolingual language models pre-trained for Vietnamese.
Experimental results show that PhoBERT consistently outperforms the recent best pre-trained multilingual model XLM-R.
We release PhoBERT to facilitate future research and downstream applications for Vietnamese NLP.
arXiv Detail & Related papers (2020-03-02T10:21:17Z)
- Multilingual Denoising Pre-training for Neural Machine Translation [132.66750663226287]
mBART is a sequence-to-sequence denoising auto-encoder pre-trained on large-scale monolingual corpora.
mBART is one of the first methods for pre-training a complete sequence-to-sequence model.
arXiv Detail & Related papers (2020-01-22T18:59:17Z)