Exploring the State-of-the-Art Language Modeling Methods and Data
Augmentation Techniques for Multilingual Clause-Level Morphology
- URL: http://arxiv.org/abs/2211.01736v1
- Date: Thu, 3 Nov 2022 11:53:39 GMT
- Title: Exploring the State-of-the-Art Language Modeling Methods and Data
Augmentation Techniques for Multilingual Clause-Level Morphology
- Authors: Emre Can Acikgoz, Tilek Chubakov, Müge Kural, Gözde Gül Şahin, Deniz Yuret
- Abstract summary: We present our work on all three parts of the shared task: inflection, reinflection, and analysis.
We mainly explore two approaches: Transformer models in combination with data augmentation, and exploiting state-of-the-art language modeling techniques for morphological analysis.
Our methods achieved first place in each of the three tasks and outperform the mT5 baseline by about 89% for inflection, 80% for reinflection, and 12% for analysis.
- Score: 3.8498574327875947
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: This paper describes the KUIS-AI NLP team's submission for the 1st
Shared Task on Multilingual Clause-level Morphology (MRL2022). We present our
work on all three parts of the shared task: inflection, reinflection, and
analysis. We mainly explore two approaches: Transformer models in combination
with data augmentation, and exploiting state-of-the-art language modeling
techniques for morphological analysis. Data augmentation leads to a remarkable
performance improvement for most of the languages in the inflection task.
Prefix-tuning on a pretrained mGPT model helps us adapt to the reinflection and
analysis tasks in a low-data setting. Additionally, we use pipeline
architectures built on publicly available open-source lemmatization tools and
monolingual BERT-based morphological feature classifiers for the reinflection
and analysis tasks, respectively. While Transformer architectures with data
augmentation and pipeline architectures achieved the best results for the
inflection and reinflection tasks, pipelines and prefix-tuning on mGPT yielded
the best results for the analysis task. Our methods achieved first place in
each of the three tasks and outperform the mT5 baseline by ~89% for inflection,
~80% for reinflection, and ~12% for analysis. Our code is publicly available at
https://github.com/emrecanacikgoz/mrl2022.
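The listing does not describe the augmentation scheme itself. As a rough illustration only, a common approach for low-resource inflection is "hallucination"-style augmentation, where the shared stem of a (lemma, inflected form) pair is replaced with random characters to create synthetic training examples; whether the authors used exactly this scheme is not stated here. A minimal sketch, with all names and thresholds assumed for illustration:

```python
# Hedged sketch of hallucination-style data augmentation for inflection.
# The stem-length threshold and helper names are illustrative assumptions;
# the paper's actual augmentation procedure is not described in this listing.
import random
import string


def longest_common_substring(a: str, b: str) -> str:
    """Naive longest-common-substring search; adequate for short word forms."""
    best = ""
    for i in range(len(a)):
        for j in range(i + 1, len(a) + 1):
            if a[i:j] in b and j - i > len(best):
                best = a[i:j]
    return best


def hallucinate(lemma: str, form: str, alphabet: str = string.ascii_lowercase):
    """Return a synthetic (lemma, form) pair with the shared stem re-sampled."""
    stem = longest_common_substring(lemma, form)
    if len(stem) < 3:  # too little shared material to swap safely
        return lemma, form
    fake_stem = "".join(random.choice(alphabet) for _ in stem)
    return lemma.replace(stem, fake_stem, 1), form.replace(stem, fake_stem, 1)


# Example: the feature bundle of the original triple would be copied unchanged.
print(hallucinate("walk", "walked"))  # e.g. ('qzro', 'qzroed')
```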
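The prefix-tuning setup is likewise only named in the abstract, not specified. The sketch below shows one plausible way to prefix-tune a pretrained mGPT checkpoint for clause-level analysis using the Hugging Face transformers and peft libraries; the model ID, the serialized "clause -> lemma; features" format, and all hyperparameters are assumptions rather than the authors' configuration.

```python
# Hedged sketch: prefix-tuning a pretrained mGPT model for morphological analysis.
# The "ai-forever/mGPT" checkpoint, the serialization format, and the
# hyperparameters are assumptions, not taken from the paper.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PrefixTuningConfig, TaskType, get_peft_model

tokenizer = AutoTokenizer.from_pretrained("ai-forever/mGPT")
model = AutoModelForCausalLM.from_pretrained("ai-forever/mGPT")

# Freeze the base model and train only a small number of virtual prefix tokens,
# which is what makes this approach attractive in a low-data setting.
peft_config = PrefixTuningConfig(task_type=TaskType.CAUSAL_LM, num_virtual_tokens=20)
model = get_peft_model(model, peft_config)
model.print_trainable_parameters()  # only the prefix parameters require gradients

# One hypothetical training example, serialized as a single causal-LM string.
example = "he will not be going -> go; TENSE=FUT, NEG=yes, PERS=3, NUM=SG"
batch = tokenizer(example, return_tensors="pt")
batch["labels"] = batch["input_ids"].clone()

optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
model.train()
loss = model(**batch).loss  # next-token loss over the serialized target
loss.backward()
optimizer.step()
```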
Related papers
- Contextualization Distillation from Large Language Model for Knowledge
Graph Completion [51.126166442122546]
We introduce the Contextualization Distillation strategy, a plug-and-play approach compatible with both discriminative and generative KGC frameworks.
Our method begins by instructing large language models to transform compact, structural triplets into context-rich segments.
Comprehensive evaluations across diverse datasets and KGC techniques highlight the efficacy and adaptability of our approach.
arXiv Detail & Related papers (2024-01-28T08:56:49Z) - Mixture-of-Linguistic-Experts Adapters for Improving and Interpreting
Pre-trained Language Models [22.977852629450346]
We propose a method that combines two popular research areas by injecting linguistic structures into pre-trained language models.
In our approach, parallel adapter modules encoding different linguistic structures are combined using a novel Mixture-of-Linguistic-Experts architecture.
Our experiment results show that our approach can outperform state-of-the-art PEFT methods with a comparable number of parameters.
arXiv Detail & Related papers (2023-10-24T23:29:06Z) - A deep Natural Language Inference predictor without language-specific
training data [44.26507854087991]
We present an NLP technique to tackle the problem of natural language inference (NLI) between pairs of sentences in a target language of choice without a language-specific training dataset.
We exploit a generic translation dataset, manually translated, along with two instances of the same pre-trained model.
The model has been evaluated on the machine-translated Stanford NLI test set, the machine-translated Multi-Genre NLI test set, and the manually translated RTE3-ITA test set.
arXiv Detail & Related papers (2023-09-06T10:20:59Z) - Extensive Evaluation of Transformer-based Architectures for Adverse Drug
Events Extraction [6.78974856327994]
Adverse Drug Event (ADE) extraction is one of the core tasks in digital pharmacovigilance.
We evaluate 19 Transformer-based models for ADE extraction on informal texts.
At the end of our analyses, we identify a list of take-home messages that can be derived from the experimental data.
arXiv Detail & Related papers (2023-06-08T15:25:24Z) - Unified Model Learning for Various Neural Machine Translation [63.320005222549646]
Existing neural machine translation (NMT) studies mainly focus on developing dataset-specific models.
We propose a "versatile" model, i.e., the Unified Model Learning for NMT (UMLNMT), that works with data from different tasks.
Our UMLNMT results in substantial improvements over dataset-specific models with significantly reduced model deployment costs.
arXiv Detail & Related papers (2023-05-04T12:21:52Z) - Multi-Scales Data Augmentation Approach In Natural Language Inference
For Artifacts Mitigation And Pre-Trained Model Optimization [0.0]
We provide a variety of techniques for analyzing and locating dataset artifacts inside the crowdsourced Stanford Natural Language Inference corpus.
To mitigate dataset artifacts, we employ a unique multi-scale data augmentation technique with two distinct frameworks.
Our combination method enhances our model's resistance to perturbation testing, enabling it to consistently outperform the pre-trained baseline.
arXiv Detail & Related papers (2022-12-16T23:37:44Z) - Visualizing the Relationship Between Encoded Linguistic Information and
Task Performance [53.223789395577796]
We study the dynamic relationship between the encoded linguistic information and task performance from the viewpoint of Pareto Optimality.
We conduct experiments on two popular NLP tasks, i.e., machine translation and language modeling, and investigate the relationship between several kinds of linguistic information and task performances.
Our empirical findings suggest that some syntactic information is helpful for NLP tasks whereas encoding more syntactic information does not necessarily lead to better performance.
arXiv Detail & Related papers (2022-03-29T19:03:10Z) - Incorporating Linguistic Knowledge for Abstractive Multi-document
Summarization [20.572283625521784]
We develop a neural network based abstractive multi-document summarization (MDS) model.
We incorporate the dependency information into a linguistic-guided attention mechanism.
With the help of linguistic signals, sentence-level relations can be correctly captured.
arXiv Detail & Related papers (2021-09-23T08:13:35Z) - Improving Neural Machine Translation by Bidirectional Training [85.64797317290349]
We present a simple and effective pretraining strategy -- bidirectional training (BiT) for neural machine translation.
Specifically, we bidirectionally update the model parameters at the early stage and then tune the model normally.
Experimental results show that BiT significantly improves state-of-the-art neural machine translation performance across 15 translation tasks on 8 language pairs.
arXiv Detail & Related papers (2021-09-16T07:58:33Z) - Mixed-Lingual Pre-training for Cross-lingual Summarization [54.4823498438831]
Cross-lingual Summarization aims at producing a summary in the target language for an article in the source language.
We propose a solution based on mixed-lingual pre-training that leverages both cross-lingual tasks like translation and monolingual tasks like masked language models.
Our model achieves an improvement of 2.82 (English to Chinese) and 1.15 (Chinese to English) ROUGE-1 scores over state-of-the-art results.
arXiv Detail & Related papers (2020-10-18T00:21:53Z) - Dynamic Data Selection and Weighting for Iterative Back-Translation [116.14378571769045]
We propose a curriculum learning strategy for iterative back-translation models.
We evaluate our models on domain adaptation, low-resource, and high-resource MT settings.
Experimental results demonstrate that our methods achieve improvements of up to 1.8 BLEU points over competitive baselines.
arXiv Detail & Related papers (2020-04-07T19:49:58Z)