Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling
- URL: http://arxiv.org/abs/2101.01213v2
- Date: Wed, 6 Jan 2021 11:05:52 GMT
- Title: Transformers and Transfer Learning for Improving Portuguese Semantic Role Labeling
- Authors: Sofia Oliveira and Daniel Loureiro and Alípio Jorge
- Abstract summary: For low resource languages, and in particular for Portuguese, currently available SRL models are hindered by scarce training data.
We explore a model architecture with only a pre-trained BERT-based model, a linear layer, softmax and Viterbi decoding.
- Score: 2.9005223064604078
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: Semantic Role Labeling (SRL) is a core Natural Language Processing task. For
English, recent methods based on Transformer models have allowed for major
improvements over the previous state of the art. However, for low resource
languages, and in particular for Portuguese, currently available SRL models are
hindered by scarce training data. In this paper, we explore a model
architecture with only a pre-trained BERT-based model, a linear layer, softmax
and Viterbi decoding. We substantially improve the state-of-the-art performance
in Portuguese by over 15 $F_1$ points. Additionally, we improve SRL results in
Portuguese corpora by exploiting cross-lingual transfer learning using
multilingual pre-trained models (XLM-R), and transfer learning from dependency
parsing in Portuguese. We evaluate the various proposed approaches empirically
and, as a result, present a heuristic that supports the choice of the most
appropriate model given the available resources.
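To make the described architecture concrete, the sketch below wires together a pre-trained BERT-style encoder, a single linear layer, softmax, and Viterbi decoding, roughly as the abstract outlines. It is a minimal illustration and not the authors' implementation: the checkpoint names, the reduced tag set, the flat transition scores, and the omission of predicate marking and word-piece alignment are all simplifying assumptions.

```python
# Minimal sketch (assumed names, not the authors' released code) of the pipeline
# described in the abstract: pre-trained encoder -> linear layer -> softmax -> Viterbi.
import torch
from transformers import AutoModel, AutoTokenizer

# Illustrative BIO tag subset; real SRL corpora use a larger PropBank-style inventory.
TAGS = ["O", "B-ARG0", "I-ARG0", "B-ARG1", "I-ARG1", "B-V", "I-V"]


class SimpleSRLTagger(torch.nn.Module):
    def __init__(self, encoder_name="neuralmind/bert-base-portuguese-cased"):
        # For the cross-lingual setting, the encoder could instead be a
        # multilingual checkpoint such as "xlm-roberta-base".
        super().__init__()
        self.encoder = AutoModel.from_pretrained(encoder_name)
        self.classifier = torch.nn.Linear(self.encoder.config.hidden_size, len(TAGS))

    def forward(self, input_ids, attention_mask):
        hidden = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        logits = self.classifier(hidden)            # (batch, seq_len, num_tags)
        return torch.log_softmax(logits, dim=-1)    # per-token log-probabilities


def viterbi_decode(log_probs, transitions):
    """Highest-scoring tag sequence given (seq_len, num_tags) emission log-probs
    and a (num_tags, num_tags) matrix of transition log-scores."""
    seq_len, _ = log_probs.shape
    scores = log_probs[0].clone()
    backpointers = []
    for t in range(1, seq_len):
        total = scores.unsqueeze(1) + transitions + log_probs[t].unsqueeze(0)
        scores, best_prev = total.max(dim=0)
        backpointers.append(best_prev)
    best_tag = int(scores.argmax())
    path = [best_tag]
    for bp in reversed(backpointers):
        best_tag = int(bp[best_tag])
        path.append(best_tag)
    return list(reversed(path))


# Usage sketch: flat (zero) transition scores stand in for learned or rule-based
# scores that would normally forbid invalid BIO moves such as O -> I-ARG0.
tokenizer = AutoTokenizer.from_pretrained("neuralmind/bert-base-portuguese-cased")
model = SimpleSRLTagger()
batch = tokenizer("O menino comeu o bolo .", return_tensors="pt")
with torch.no_grad():
    log_probs = model(batch["input_ids"], batch["attention_mask"])[0]
transitions = torch.zeros(len(TAGS), len(TAGS))
print([TAGS[i] for i in viterbi_decode(log_probs, transitions)])
```

Swapping the encoder name for a multilingual checkpoint is one plausible way to approximate the XLM-R cross-lingual transfer setting mentioned in the abstract; everything else in the sketch stays the same.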
Related papers
- Tucano: Advancing Neural Text Generation for Portuguese [0.0]
This study aims to introduce a new set of resources to stimulate the future development of neural text generation in Portuguese.
In this work, we document the development of GigaVerbo, a concatenation of deduplicated Portuguese text corpora amounting to 200 billion tokens.
Our models perform on par with or better than other Portuguese and multilingual language models of similar size on several Portuguese benchmarks.
arXiv Detail & Related papers (2024-11-12T15:06:06Z)
- MoE-CT: A Novel Approach For Large Language Models Training With Resistance To Catastrophic Forgetting [53.77590764277568]
We introduce a novel MoE-CT architecture that separates the base model's learning from the multilingual expansion process.
Our design freezes the original LLM parameters, thus safeguarding its performance in high-resource languages, while an appended MoE module, trained on diverse language datasets, augments low-resource language proficiency.
arXiv Detail & Related papers (2024-06-25T11:03:45Z)
- PORTULAN ExtraGLUE Datasets and Models: Kick-starting a Benchmark for the Neural Processing of Portuguese [1.2779732438508473]
We contribute a collection of datasets for an array of language processing tasks and a collection of fine-tuned neural language models on these downstream tasks.
To align with mainstream benchmarks in the literature, which were originally developed in English, the datasets were machine-translated from English with a state-of-the-art translation engine.
The resulting PORTULAN ExtraGLUE benchmark is a basis for research on Portuguese whose improvement can be pursued in future work.
arXiv Detail & Related papers (2024-04-08T09:22:41Z)
- Glória - A Generative and Open Large Language Model for Portuguese [4.782288068552145]
We introduce Glória, a robust European Portuguese decoder LLM.
To pre-train Glória, we assembled a comprehensive PT-PT text corpus comprising 35 billion tokens from various sources.
Evaluation shows that Gl'orIA significantly outperforms existing open PT decoder models in language modeling.
arXiv Detail & Related papers (2024-02-20T12:36:40Z)
- Self-Augmentation Improves Zero-Shot Cross-Lingual Transfer [92.80671770992572]
Cross-lingual transfer is a central task in multilingual NLP.
Earlier efforts on this task use parallel corpora, bilingual dictionaries, or other annotated alignment data.
We propose a simple yet effective method, SALT, to improve zero-shot cross-lingual transfer.
arXiv Detail & Related papers (2023-09-19T19:30:56Z)
- Cross-lingual Transferring of Pre-trained Contextualized Language Models [73.97131976850424]
We propose a novel cross-lingual model transferring framework for PrLMs: TreLM.
To handle the symbol order and sequence length differences between languages, we propose an intermediate "TRILayer" structure.
We show the proposed framework significantly outperforms language models trained from scratch with limited data in both performance and efficiency.
arXiv Detail & Related papers (2021-07-27T06:51:13Z)
- UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z)
- Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z)
- MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer [136.09386219006123]
We propose MAD-X, an adapter-based framework that enables high portability and parameter-efficient transfer to arbitrary tasks and languages.
MAD-X outperforms the state of the art in cross-lingual transfer across a representative set of typologically diverse languages on named entity recognition and causal commonsense reasoning.
arXiv Detail & Related papers (2020-04-30T18:54:43Z)