ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation
- URL: http://arxiv.org/abs/2205.06457v1
- Date: Fri, 13 May 2022 06:08:35 GMT
- Title: ViT5: Pretrained Text-to-Text Transformer for Vietnamese Language Generation
- Authors: Long Phan, Hieu Tran, Hieu Nguyen, Trieu H. Trinh
- Abstract summary: We present ViT5, a pretrained Transformer-based encoder-decoder model for the Vietnamese language.
With T5-style self-supervised pretraining, ViT5 is trained on a large corpus of high-quality and diverse Vietnamese texts.
- Score: 2.0302025541827247
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: We present ViT5, a pretrained Transformer-based encoder-decoder model for the
Vietnamese language. With T5-style self-supervised pretraining, ViT5 is trained
on a large corpus of high-quality and diverse Vietnamese texts. We benchmark
ViT5 on two downstream text generation tasks, Abstractive Text Summarization
and Named Entity Recognition. Although Abstractive Text Summarization has been
widely studied for the English language thanks to its rich and large source of
data, there has been minimal research into the same task in Vietnamese, a much
lower resource language. In this work, we perform exhaustive experiments on
both Vietnamese Abstractive Summarization and Named Entity Recognition,
validating the performance of ViT5 against many other pretrained
Transformer-based encoder-decoder models. Our experiments show that ViT5
significantly outperforms existing models and achieves state-of-the-art results
on Vietnamese Text Summarization. On the task of Named Entity Recognition, ViT5
is competitive against previous best results from pretrained encoder-based
Transformer models. Further analysis shows the importance of context length
during the self-supervised pretraining on downstream performance across
different settings.
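As a concrete illustration of how a released ViT5 checkpoint could be applied to the summarization task described above, the minimal sketch below loads the model with the Hugging Face transformers library and generates an abstractive summary. The checkpoint id "VietAI/vit5-base-vietnews-summarization", the input length limit, and the beam-search settings are assumptions for illustration, not details taken from the paper.

# Minimal sketch (assumptions noted above): Vietnamese abstractive
# summarization with a ViT5-style encoder-decoder checkpoint.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "VietAI/vit5-base-vietnews-summarization"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

document = "..."  # a Vietnamese news article to summarize

# Encode the source document and generate a summary with beam search.
inputs = tokenizer(document, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(
    **inputs,
    max_length=256,      # cap on generated summary length
    num_beams=4,         # common beam-search default for summarization
    early_stopping=True,
)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))

Because the paper also casts Named Entity Recognition as text generation, an analogous generate() call would cover that task once a checkpoint fine-tuned for NER and its expected input format are used.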
Related papers
- ViHateT5: Enhancing Hate Speech Detection in Vietnamese With A Unified Text-to-Text Transformer Model [0.0]
We introduce ViHateT5, a T5-based model pre-trained on our proposed large-scale domain-specific dataset named VOZ-HSD.
By harnessing the power of a text-to-text architecture, ViHateT5 can tackle multiple tasks using a unified model and achieve state-of-the-art performance across all standard HSD benchmarks in Vietnamese.
arXiv Detail & Related papers (2024-05-23T03:31:50Z)
- A Text-to-Text Model for Multilingual Offensive Language Identification [19.23565690468299]
This study presents the first pre-trained model with an encoder-decoder architecture for offensive language identification with text-to-text transformers (T5).
Our pre-trained T5 model outperforms other transformer-based models fine-tuned for offensive language detection, such as fBERT and HateBERT, in multiple English benchmarks.
Following a similar approach, we also train the first multilingual pre-trained model for offensive language identification using mT5.
arXiv Detail & Related papers (2023-12-06T09:37:27Z)
- Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing [56.232873134174056]
One of the major challenges in text-to-SQL parsing is domain generalization, i.e., how to generalize well to unseen databases.
In this work, we explore ways to further augment the pre-trained text-to-text transformer model with specialized components for text-to-SQL parsing.
To this end, we propose a new architecture, GRAPHIX-T5, which augments the pre-trained T5 model with specially-designed graph-aware layers.
arXiv Detail & Related papers (2023-01-18T13:29:05Z)
- T5lephone: Bridging Speech and Text Self-supervised Models for Spoken Language Understanding via Phoneme level T5 [65.32642587901903]
We conduct extensive studies on how pretrained language models (PLMs) with different tokenization strategies affect spoken language understanding tasks.
We extend the idea to create T5lephone, a variant of T5 that is pretrained using phonemicized text.
arXiv Detail & Related papers (2022-11-01T17:00:23Z)
- M-Adapter: Modality Adaptation for End-to-End Speech-to-Text Translation [66.92823764664206]
We propose M-Adapter, a novel Transformer-based module, to adapt speech representations to text.
While shrinking the speech sequence, M-Adapter produces features desired for speech-to-text translation.
Our experimental results show that our model outperforms a strong baseline by up to 1 BLEU.
arXiv Detail & Related papers (2022-07-03T04:26:53Z)
- Evaluation of Transfer Learning for Polish with a Text-to-Text Model [54.81823151748415]
We introduce a new benchmark for assessing the quality of text-to-text models for Polish.
The benchmark consists of diverse tasks and datasets: KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering.
We present plT5 - a general-purpose text-to-text model for Polish that can be fine-tuned on various Natural Language Processing (NLP) tasks with a single training objective.
arXiv Detail & Related papers (2022-05-18T09:17:14Z)
- VieSum: How Robust Are Transformer-based Models on Vietnamese Summarization? [1.1379578593538398]
We investigate the robustness of transformer-based encoder-decoder architectures for Vietnamese abstractive summarization.
We validate the performance of the methods on two Vietnamese datasets.
arXiv Detail & Related papers (2021-10-08T17:10:31Z)
- mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs [51.67970832510462]
We improve the multilingual text-to-text transfer Transformer with translation pairs (mT6).
We explore three cross-lingual text-to-text pre-training tasks, namely, machine translation, translation pair span corruption, and translation span corruption.
Experimental results show that the proposed mT6 improves cross-lingual transferability over mT5.
arXiv Detail & Related papers (2021-04-18T03:24:07Z)
- mT5: A massively multilingual pre-trained text-to-text transformer [60.0210636815514]
"Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on English-language NLP tasks.
We introduce mT5, a multilingual variant of T5 that was pre-trained on a new Common Crawl-based dataset covering 101 languages.
arXiv Detail & Related papers (2020-10-22T17:58:14Z)