GreekBART: The First Pretrained Greek Sequence-to-Sequence Model
- URL: http://arxiv.org/abs/2304.00869v1
- Date: Mon, 3 Apr 2023 10:48:51 GMT
- Title: GreekBART: The First Pretrained Greek Sequence-to-Sequence Model
- Authors: Iakovos Evdaimon, Hadi Abdine, Christos Xypolopoulos, Stamatis
Outsios, Michalis Vazirgiannis, Giorgos Stamou
- Abstract summary: We introduce GreekBART, the first Seq2Seq model based on BART-base architecture and pretrained on a large-scale Greek corpus.
We evaluate and compare GreekBART against BART-random, Greek-BERT, and XLM-R on a variety of discriminative tasks.
- Score: 13.429669368275318
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: The era of transfer learning has revolutionized the fields of Computer Vision
and Natural Language Processing, bringing powerful pretrained models with
exceptional performance across a variety of tasks. Specifically, Natural
Language Processing tasks have been dominated by transformer-based language
models. In Natural Language Inference and Natural Language Generation tasks,
the BERT model and its variants, as well as the GPT model and its successors,
demonstrated exemplary performance. However, the majority of these models are
pretrained and assessed primarily for the English language or on a multilingual
corpus. In this paper, we introduce GreekBART, the first Seq2Seq model based on
BART-base architecture and pretrained on a large-scale Greek corpus. We
evaluate and compare GreekBART against BART-random, Greek-BERT, and XLM-R on a
variety of discriminative tasks. In addition, we examine its performance on two
NLG tasks from GreekSUM, a newly introduced summarization dataset for the Greek
language. The model, the code, and the new summarization dataset will be
publicly available.
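Since the checkpoint is slated for public release, a summarization call would presumably follow the standard Hugging Face seq2seq pattern. The sketch below is illustrative only; the model identifier is a placeholder, not the official release name.

```python
# Hedged sketch: using a BART-base seq2seq checkpoint such as GreekBART for
# abstractive summarization via the standard transformers API.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_id = "example-org/greekbart-base"  # hypothetical Hub identifier
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSeq2SeqLM.from_pretrained(model_id)

article = "Το κείμενο ενός ελληνικού άρθρου προς περίληψη ..."
inputs = tokenizer(article, return_tensors="pt", truncation=True, max_length=1024)
summary_ids = model.generate(**inputs, num_beams=4, max_length=128)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```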
Related papers
- GreekT5: A Series of Greek Sequence-to-Sequence Models for News Summarization [0.0]
This paper proposes a series of novel T5 models for Greek news articles.
The proposed models were thoroughly evaluated on the same dataset against GreekBART.
Our evaluation results reveal that most of the proposed models significantly outperform GreekBART on various evaluation metrics.
arXiv Detail & Related papers (2023-11-13T21:33:12Z)
- Masked Autoencoders As The Unified Learners For Pre-Trained Sentence Representation [77.47617360812023]
We extend the recently proposed MAE style pre-training strategy, RetroMAE, to support a wide variety of sentence representation tasks.
The first stage performs RetroMAE over generic corpora, like Wikipedia, BookCorpus, etc., from which the base model is learned.
The second stage takes place on domain-specific data, e.g., MS MARCO and NLI, where the base model is further trained with RetroMAE and contrastive learning.
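The summary names contrastive learning for the second stage without detail; the snippet below is a generic in-batch InfoNCE objective of that kind, included only as a sketch of the idea and not the authors' training code.

```python
# Hedged sketch: a generic in-batch contrastive (InfoNCE) loss over paired
# sentence embeddings, illustrating the kind of objective combined with
# RetroMAE in the second stage. Not the authors' implementation.
import torch
import torch.nn.functional as F

def info_nce(query_emb: torch.Tensor, passage_emb: torch.Tensor, temperature: float = 0.05):
    q = F.normalize(query_emb, dim=-1)
    p = F.normalize(passage_emb, dim=-1)
    logits = q @ p.T / temperature    # similarity of every query to every passage
    labels = torch.arange(q.size(0))  # the positive passage sits on the diagonal
    return F.cross_entropy(logits, labels)

loss = info_nce(torch.randn(8, 768), torch.randn(8, 768))
```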
arXiv Detail & Related papers (2022-07-30T14:34:55Z)
- Interpreting Language Models Through Knowledge Graph Extraction [42.97929497661778]
We compare BERT-based language models through snapshots of acquired knowledge at sequential stages of the training process.
We present a methodology to unveil a knowledge acquisition timeline by generating knowledge graph extracts from cloze "fill-in-the-blank" statements.
We extend this analysis to a comparison of pretrained variations of BERT models (DistilBERT, BERT-base, RoBERTa).
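As a rough illustration of the cloze-style probing described above (not the paper's actual pipeline), a masked language model can be queried directly; the model and prompt below are illustrative.

```python
# Hedged sketch: probing a masked LM with a cloze "fill-in-the-blank" statement,
# in the spirit of the knowledge-acquisition analysis above.
from transformers import pipeline

fill = pipeline("fill-mask", model="bert-base-uncased")
for pred in fill("The capital of Greece is [MASK].", top_k=3):
    print(pred["token_str"], round(pred["score"], 3))
```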
arXiv Detail & Related papers (2021-11-16T15:18:01Z)
- Multi-granular Legal Topic Classification on Greek Legislation [4.09134848993518]
We study the task of classifying legal texts written in the Greek language.
This is the first time the task of Greek legal text classification is considered in an open research project.
arXiv Detail & Related papers (2021-09-30T17:43:00Z)
- Paraphrastic Representations at Scale [134.41025103489224]
We release trained models for English, Arabic, German, French, Spanish, Russian, Turkish, and Chinese.
We train these models on large amounts of data, achieving significantly improved performance over the original papers.
arXiv Detail & Related papers (2021-04-30T16:55:28Z)
- Lattice-BERT: Leveraging Multi-Granularity Representations in Chinese Pre-trained Language Models [62.41139712595334]
We propose a novel pre-training paradigm for Chinese -- Lattice-BERT.
We construct a lattice graph from the characters and words in a sentence and feed all these text units into transformers.
We show that our model can bring an average increase of 1.5% under the 12-layer setting.
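To make the lattice idea concrete, the sketch below enumerates the character and lexicon-word spans of a short Chinese sentence; it is a simplified illustration under an assumed toy lexicon, not the Lattice-BERT implementation.

```python
# Hedged sketch: building a word-character lattice, i.e. the set of text units
# (characters plus lexicon words with their spans) that a lattice-style model
# would consume. Simplified illustration only.
def build_lattice(sentence, lexicon, max_word_len=4):
    units = [(i, i + 1, ch) for i, ch in enumerate(sentence)]  # every character
    for i in range(len(sentence)):
        for j in range(i + 2, min(i + max_word_len, len(sentence)) + 1):
            word = sentence[i:j]
            if word in lexicon:
                units.append((i, j, word))  # lexicon word covering chars [i, j)
    return units

lexicon = {"北京", "北京大学", "大学"}
print(build_lattice("北京大学", lexicon))
# [(0, 1, '北'), (1, 2, '京'), (2, 3, '大'), (3, 4, '学'),
#  (0, 2, '北京'), (0, 4, '北京大学'), (2, 4, '大学')]
```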
arXiv Detail & Related papers (2021-04-15T02:36:49Z)
- Learning Contextual Representations for Semantic Parsing with Generation-Augmented Pre-Training [86.91380874390778]
We present Generation-Augmented Pre-training (GAP), which jointly learns representations of natural language utterances and table schemas by leveraging generation models to produce pre-training data.
Based on experimental results, neural semantic parsers that leverage the GAP model obtain new state-of-the-art results on both the SPIDER and CRITERIA-TO-SQL benchmarks.
arXiv Detail & Related papers (2020-12-18T15:53:50Z)
- Unsupervised Paraphrasing with Pretrained Language Models [85.03373221588707]
We propose a training pipeline that enables pre-trained language models to generate high-quality paraphrases in an unsupervised setting.
Our recipe consists of task-adaptation, self-supervision, and a novel decoding algorithm named Dynamic Blocking.
We show with automatic and human evaluations that our approach achieves state-of-the-art performance on both the Quora Question Pair and the ParaNMT datasets.
arXiv Detail & Related papers (2020-10-24T11:55:28Z)
- Pre-training Polish Transformer-based Language Models at Scale [1.0312968200748118]
We present two language models for Polish based on the popular BERT architecture.
We describe our methodology for collecting the data, preparing the corpus, and pre-training the model.
We then evaluate our models on thirteen Polish linguistic tasks, and demonstrate improvements in eleven of them.
arXiv Detail & Related papers (2020-06-07T18:48:58Z)
- ParsBERT: Transformer-based Model for Persian Language Understanding [0.7646713951724012]
This paper proposes a monolingual BERT for the Persian language (ParsBERT).
It shows state-of-the-art performance compared to other architectures and multilingual models.
ParsBERT obtains higher scores on all datasets, including existing ones as well as ones composed by the authors.
arXiv Detail & Related papers (2020-05-26T05:05:32Z)
- Optimus: Organizing Sentences via Pre-trained Modeling of a Latent Space [109.79957125584252]
The Variational Autoencoder (VAE) can be both a powerful generative model and an effective representation learning framework for natural language.
In this paper, we propose the first large-scale language VAE model, Optimus.
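For context, the standard VAE training objective that such models maximize is the evidence lower bound (a textbook formula, not quoted from the paper):

```latex
% Evidence lower bound (ELBO) for a sentence x with latent code z:
\mathcal{L}(\theta, \phi; x) =
  \mathbb{E}_{q_\phi(z \mid x)}\big[\log p_\theta(x \mid z)\big]
  - \mathrm{KL}\big(q_\phi(z \mid x) \,\|\, p(z)\big)
```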
arXiv Detail & Related papers (2020-04-05T06:20:18Z)