Enhancing deep neural networks with morphological information
- URL: http://arxiv.org/abs/2011.12432v2
- Date: Tue, 1 Mar 2022 22:51:01 GMT
- Title: Enhancing deep neural networks with morphological information
- Authors: Matej Klemen, Luka Krsnik, Marko Robnik-Šikonja
- Abstract summary: We analyse the effect of adding morphological features to LSTM and BERT models.
Our results suggest that adding morphological features has mixed effects depending on the quality of features and the task.
- Score: 0.0
- License: http://creativecommons.org/licenses/by-sa/4.0/
- Abstract: Deep learning approaches are superior in NLP due to their ability to extract
informative features and patterns from languages. The two most successful
neural architectures are LSTM and transformers, used in large pretrained
language models such as BERT. While cross-lingual approaches are on the rise,
most current NLP techniques are designed and applied to English, and
less-resourced languages are lagging behind. In morphologically rich languages,
information is conveyed through morphology, e.g., through affixes modifying
stems of words. Existing neural approaches do not explicitly use the
information on word morphology. We analyse the effect of adding morphological
features to LSTM and BERT models. As a testbed, we use three tasks available in
many less-resourced languages: named entity recognition (NER), dependency
parsing (DP), and comment filtering (CF). We construct baselines involving LSTM
and BERT models, which we adjust by adding additional input in the form of part
of speech (POS) tags and universal features. We compare models across several
languages from different language families. Our results suggest that adding
morphological features has mixed effects depending on the quality of features
and the task. The features improve the performance of LSTM-based models on the
NER and DP tasks, while they do not benefit the performance on the CF task. For
BERT-based models, the morphological features only improve the performance on
DP when they are of high quality while not showing practical improvement when
they are predicted. Even for high-quality features, the improvements are less
pronounced in language-specific BERT variants compared to massively
multilingual BERT models. As manually checked features are not available for the
NER and CF datasets, we only experiment with predicted features and find that
they do not yield any practical improvement in performance.
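The abstract does not spell out how the POS tags and universal features enter the networks. Below is a minimal sketch in PyTorch, assuming the common design of concatenating per-token embeddings of the morphological features with the word embeddings before a bidirectional LSTM encoder; all class, dimension, and parameter names are illustrative, not the authors' code.

    # Illustrative sketch only (not the paper's implementation): injecting
    # morphological features into a BiLSTM tagger by concatenating their
    # embeddings with the word embeddings for every token.
    import torch
    import torch.nn as nn

    class MorphAwareLSTMTagger(nn.Module):
        def __init__(self, vocab_size, morph_vocab_size, n_labels,
                     word_dim=100, morph_dim=25, hidden_dim=128):
            super().__init__()
            self.word_emb = nn.Embedding(vocab_size, word_dim)
            # One embedding table for POS tags / universal features.
            self.morph_emb = nn.Embedding(morph_vocab_size, morph_dim)
            self.encoder = nn.LSTM(word_dim + morph_dim, hidden_dim,
                                   batch_first=True, bidirectional=True)
            self.classifier = nn.Linear(2 * hidden_dim, n_labels)  # e.g. NER labels

        def forward(self, word_ids, morph_ids):
            # Concatenate word and morphological-feature embeddings per token.
            x = torch.cat([self.word_emb(word_ids), self.morph_emb(morph_ids)], dim=-1)
            hidden, _ = self.encoder(x)
            return self.classifier(hidden)  # (batch, seq_len, n_labels)

For the BERT-based variants, the same idea would amount to combining the feature embeddings with the transformer's contextual token representations before the task-specific classifier; the sketch above only covers the LSTM case.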
Related papers
- Why do language models perform worse for morphologically complex languages? [0.913127392774573]
We find new evidence for a performance gap between agglutinative and fusional languages.
We propose three possible causes for this performance gap: morphological alignment of tokenizers, tokenization quality, and disparities in dataset sizes and measurement.
Results suggest that no language is harder or easier for a language model to learn on the basis of its morphological typology.
arXiv Detail & Related papers (2024-11-21T15:06:51Z) - ML-SUPERB 2.0: Benchmarking Multilingual Speech Models Across Modeling Constraints, Languages, and Datasets [106.7760874400261]
This paper presents ML-SUPERB 2.0, a new benchmark for evaluating pre-trained self-supervised (SSL) and supervised speech models.
We find performance improvements over the setup of ML-SUPERB, but performance depends on the downstream model design.
Also, we find large performance differences between languages and datasets, suggesting the need for more targeted approaches.
arXiv Detail & Related papers (2024-06-12T21:01:26Z) - Improving Massively Multilingual ASR With Auxiliary CTC Objectives [40.10307386370194]
We introduce our work on improving performance on FLEURS, a 102-language open ASR benchmark.
We investigate techniques inspired by recent Connectionist Temporal Classification (CTC) studies to help the model handle the large number of languages.
Our state-of-the-art systems using self-supervised models with the Conformer architecture improve over the results of prior work on FLEURS by a relative 28.4% CER.
arXiv Detail & Related papers (2023-02-24T18:59:51Z) - KinyaBERT: a Morphology-aware Kinyarwanda Language Model [1.2183405753834562]
Unsupervised sub-word tokenization methods are sub-optimal at handling morphologically rich languages.
We propose a simple yet effective two-tier BERT architecture that leverages a morphological analyzer and explicitly represents morphological compositionality.
We evaluate our proposed method on the low-resource morphologically rich Kinyarwanda language, naming the proposed model architecture KinyaBERT.
arXiv Detail & Related papers (2022-03-16T08:36:14Z) - NL-Augmenter: A Framework for Task-Sensitive Natural Language
Augmentation [91.97706178867439]
We present NL-Augmenter, a new participatory Python-based natural language augmentation framework.
We describe the framework and an initial set of 117 transformations and 23 filters for a variety of natural language tasks.
We demonstrate the efficacy of NL-Augmenter by using several of its transformations to analyze the robustness of popular natural language models.
arXiv Detail & Related papers (2021-12-06T00:37:59Z) - To Augment or Not to Augment? A Comparative Study on Text Augmentation
Techniques for Low-Resource NLP [0.0]
We investigate three categories of text augmentation methods that operate on sentence syntax.
We compare them on part-of-speech tagging, dependency parsing and semantic role labeling for a diverse set of language families.
Our results suggest that the augmentation techniques can further improve over strong baselines based on mBERT.
arXiv Detail & Related papers (2021-11-18T10:52:48Z) - Continual Learning in Multilingual NMT via Language-Specific Embeddings [92.91823064720232]
The proposed approach replaces the shared vocabulary with a small language-specific vocabulary and fine-tunes the new embeddings on the new language's parallel data.
Because the parameters of the original model are not modified, its performance on the initial languages does not degrade.
arXiv Detail & Related papers (2021-10-20T10:38:57Z) - UNKs Everywhere: Adapting Multilingual Language Models to New Scripts [103.79021395138423]
Massively multilingual language models such as multilingual BERT (mBERT) and XLM-R offer state-of-the-art cross-lingual transfer performance on a range of NLP tasks.
Due to their limited capacity and large differences in pretraining data, there is a profound performance gap between resource-rich and resource-poor target languages.
We propose novel data-efficient methods that enable quick and effective adaptation of pretrained multilingual models to such low-resource languages and unseen scripts.
arXiv Detail & Related papers (2020-12-31T11:37:28Z) - Pre-training Multilingual Neural Machine Translation by Leveraging
Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions in diverse settings, covering low-, medium-, and rich-resource language pairs, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z) - ParsBERT: Transformer-based Model for Persian Language Understanding [0.7646713951724012]
This paper proposes ParsBERT, a monolingual BERT for the Persian language.
ParsBERT achieves state-of-the-art performance compared to other architectures and multilingual models.
It obtains higher scores on all datasets, including existing ones as well as newly composed ones.
arXiv Detail & Related papers (2020-05-26T05:05:32Z) - Cross-lingual, Character-Level Neural Morphological Tagging [57.0020906265213]
We train character-level recurrent neural taggers to predict morphological tags jointly for high-resource and low-resource languages.
Learning joint character representations among multiple related languages successfully enables knowledge transfer from the high-resource languages to the low-resource ones, improving accuracy by up to 30% over a monolingual model.
arXiv Detail & Related papers (2017-08-30T08:14:34Z)