Evaluation of Transfer Learning for Polish with a Text-to-Text Model
- URL: http://arxiv.org/abs/2205.08808v1
- Date: Wed, 18 May 2022 09:17:14 GMT
- Title: Evaluation of Transfer Learning for Polish with a Text-to-Text Model
- Authors: Aleksandra Chrabrowa, Łukasz Dragan, Karol Grzegorczyk, Dariusz Kajtoch, Mikołaj Koszowski, Robert Mroczkowski, Piotr Rybak
- Abstract summary: We introduce a new benchmark for assessing the quality of text-to-text models for Polish.
The benchmark consists of diverse tasks and datasets: KLEJ benchmark adapted for text-to-text, en-pl translation, summarization, and question answering.
We present plT5 - a general-purpose text-to-text model for Polish that can be fine-tuned on various Natural Language Processing (NLP) tasks with a single training objective.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: We introduce a new benchmark for assessing the quality of text-to-text models
for Polish. The benchmark consists of diverse tasks and datasets: KLEJ
benchmark adapted for text-to-text, en-pl translation, summarization, and
question answering. In particular, since summarization and question answering
lack benchmark datasets for the Polish language, we describe their construction
and make them publicly available. Additionally, we present plT5 - a
general-purpose text-to-text model for Polish that can be fine-tuned on various
Natural Language Processing (NLP) tasks with a single training objective.
Unsupervised denoising pre-training is performed efficiently by initializing
the model weights with a multi-lingual T5 (mT5) counterpart. We evaluate the
performance of plT5, mT5, Polish BART (plBART), and Polish GPT-2 (papuGaPT2).
plT5 achieves the best results on all of these tasks except summarization, where
plBART performs best. In general (except for summarization), the larger the model,
the better the results. The encoder-decoder architectures prove to be better than
the decoder-only equivalent.
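
To make the single text-to-text training objective concrete, here is a minimal fine-tuning sketch using the Hugging Face transformers library. It assumes a plT5 checkpoint published under the name "allegro/plt5-base"; the prompt wording and toy examples are illustrative and are not taken from the paper or the benchmark.

```python
# A minimal sketch of text-to-text fine-tuning with Hugging Face transformers.
# The checkpoint name, prompt wording, and toy examples below are assumptions
# for illustration; they are not taken from the paper or the benchmark.
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer

model_name = "allegro/plt5-base"  # assumed Hub identifier for a plT5 checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Every task is expressed as input text -> target text, e.g. a KLEJ-style
# sentiment example and an en-pl translation example (hypothetical prompts).
examples = [
    ("klej sentyment: Bardzo dobry produkt, polecam.", "pozytywny"),
    ("przetłumacz na polski: The model works well.", "Model działa dobrze."),
]

inputs = tokenizer([src for src, _ in examples], return_tensors="pt",
                   padding=True, truncation=True)
labels = tokenizer([tgt for _, tgt in examples], return_tensors="pt",
                   padding=True, truncation=True).input_ids
labels[labels == tokenizer.pad_token_id] = -100  # ignore padding in the loss

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
loss = model(**inputs, labels=labels).loss  # cross-entropy over target tokens
loss.backward()
optimizer.step()
print(f"training loss: {loss.item():.4f}")
```

The same loop covers every task in the benchmark (classification, translation, summarization, question answering), because each is reduced to predicting target text from input text.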
Related papers
- PL-MTEB: Polish Massive Text Embedding Benchmark (arXiv, 2024-05-16)
  The Polish Massive Text Embedding Benchmark (PL-MTEB) is a benchmark for text embeddings in Polish. It consists of 28 diverse NLP tasks from 5 task types.
- Multilingual E5 Text Embeddings: A Technical Report (arXiv, 2024-02-08)
  Three embedding models of different sizes are provided, offering a balance between inference efficiency and embedding quality. The report also introduces a new instruction-tuned embedding model whose performance is on par with state-of-the-art, English-only models of similar size.
- Pre-Training to Learn in Context (arXiv, 2023-05-16)
  The ability of in-context learning is not fully exploited because language models are not explicitly trained to learn in context. The authors propose PICL (Pre-training for In-Context Learning), a framework to enhance the in-context learning ability of language models. Experiments show that PICL is more effective and task-generalizable than a range of baselines, outperforming larger language models with nearly 4x the parameters.
- Text Embeddings by Weakly-Supervised Contrastive Pre-training (arXiv, 2022-12-07)
  E5 is a family of state-of-the-art text embeddings that transfer well to a wide range of tasks. E5 can be readily used as a general-purpose embedding model for any task requiring a single-vector representation of text.
- This is the way: designing and compiling LEPISZCZE, a comprehensive NLP benchmark for Polish (arXiv, 2022-11-23)
  LEPISZCZE is a new, comprehensive benchmark for Polish NLP. It uses five datasets from the existing Polish benchmark and adds eight novel datasets. The authors share insights and lessons learned while creating the benchmark as a blueprint for designing similar benchmarks for other low-resource languages.
- mT6: Multilingual Pretrained Text-to-Text Transformer with Translation Pairs (arXiv, 2021-04-18)
  mT6 improves the multilingual text-to-text transfer Transformer with translation pairs, exploring three cross-lingual text-to-text pre-training tasks: machine translation, translation pair span corruption, and translation span corruption (a simplified sketch of span corruption follows this list). Experimental results show that mT6 improves cross-lingual transferability over mT5.
- A Survey of Recent Abstract Summarization Techniques (arXiv, 2021-04-15)
  The survey investigates the impact of pre-trained models on several Wikipedia datasets in English and Indonesian. The most significant factors influencing ROUGE performance are coverage, density, and compression; T5-Large, Pegasus-XSum, and ProphetNet-CNNDM provide the best summaries.
- Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization (arXiv, 2020-12-21)
  The paper introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed are mT5 and an encoder-decoder version of the ParsBERT model.
- mT5: A massively multilingual pre-trained text-to-text transformer (arXiv, 2020-10-22)
  The "Text-to-Text Transfer Transformer" (T5) leveraged a unified text-to-text format and scale to attain state-of-the-art results on English-language NLP tasks. mT5 is a multilingual variant of T5 pre-trained on a new Common Crawl-based dataset covering 101 languages.
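
The unsupervised denoising pre-training mentioned in the abstract, and the span-corruption variants explored by mT6 above, follow the T5 recipe: contiguous spans of the input are replaced with sentinel tokens and the model learns to reconstruct the removed spans. Below is a minimal, word-level sketch of that idea; the function name, hyperparameters, and example sentence are illustrative, and the real objective operates on subword tokens and appends a final sentinel to the target.

```python
# A simplified, word-level illustration of T5-style span corruption (denoising).
# Hyperparameters and the example sentence are illustrative assumptions.
import random

def span_corrupt(words, corruption_rate=0.15, max_span_len=3, seed=0):
    """Replace random word spans with <extra_id_N> sentinels (T5-style denoising).

    Returns an (input_text, target_text) pair: the input keeps the visible words
    plus one sentinel per removed span; the target lists each sentinel followed
    by the words that were removed there.
    """
    rng = random.Random(seed)
    n_to_mask = max(1, int(len(words) * corruption_rate))
    masked = set()
    while len(masked) < n_to_mask:  # pick spans until enough words are hidden
        start = rng.randrange(len(words))
        span = rng.randint(1, max_span_len)
        masked.update(range(start, min(start + span, len(words))))

    inp, tgt, sentinel, i = [], [], 0, 0
    while i < len(words):
        if i in masked:
            inp.append(f"<extra_id_{sentinel}>")
            tgt.append(f"<extra_id_{sentinel}>")
            while i < len(words) and i in masked:
                tgt.append(words[i])
                i += 1
            sentinel += 1
        else:
            inp.append(words[i])
            i += 1
    return " ".join(inp), " ".join(tgt)

sentence = "Wprowadzamy nowy test do oceny jakości modeli typu tekst na tekst dla języka polskiego"
corrupted_input, target = span_corrupt(sentence.split())
print(corrupted_input)
print(target)
```

Initializing plT5 from the corresponding mT5 checkpoint means this denoising objective only needs to adapt already-multilingual weights to Polish text, which is what makes the pre-training efficient.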