Syntactic Inductive Bias in Transformer Language Models: Especially
Helpful for Low-Resource Languages?
- URL: http://arxiv.org/abs/2311.00268v1
- Date: Wed, 1 Nov 2023 03:32:46 GMT
- Title: Syntactic Inductive Bias in Transformer Language Models: Especially
Helpful for Low-Resource Languages?
- Authors: Luke Gessler, Nathan Schneider
- Abstract summary: A line of work on Transformer-based language models has attempted to use syntactic inductive bias to enhance the pretraining process.
We investigate whether these methods can compensate for data sparseness in low-resource languages.
We find that these syntactic inductive bias methods produce uneven results in low-resource settings.
- Score: 10.324936426012417
- License: http://creativecommons.org/licenses/by/4.0/
- Abstract: A line of work on Transformer-based language models such as BERT has
attempted to use syntactic inductive bias to enhance the pretraining process,
on the theory that building syntactic structure into the training process
should reduce the amount of data needed for training. But such methods are
often tested only on high-resource languages such as English. In this work, we
investigate whether these methods can compensate for data sparseness in
low-resource languages, hypothesizing that they ought to be more effective for
low-resource languages. We experiment with five low-resource languages: Uyghur,
Wolof, Maltese, Coptic, and Ancient Greek. We find that these syntactic
inductive bias methods produce uneven results in low-resource settings, and
provide surprisingly little benefit in most cases.
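The abstract does not say which syntactic inductive bias methods are evaluated, so the sketch below is only a hedged illustration of one common flavour of such a bias, not the authors' method: masking Transformer self-attention so that each token attends only to itself and its dependency-tree neighbours. The example sentence, head indices, and function names are invented for this sketch.

```python
# Minimal sketch (assumption, not the paper's method): a dependency-based
# attention mask as one concrete form of syntactic inductive bias.
import numpy as np

def dependency_attention_mask(heads: list[int]) -> np.ndarray:
    """Build a boolean attention mask from dependency heads.

    heads[i] is the index of token i's head, or -1 for the root.
    mask[i, j] is True if token i may attend to token j.
    """
    n = len(heads)
    mask = np.eye(n, dtype=bool)      # every token attends to itself
    for i, h in enumerate(heads):
        if h >= 0:
            mask[i, h] = True         # child attends to its head
            mask[h, i] = True         # head attends to its child
    return mask

def masked_softmax(scores: np.ndarray, mask: np.ndarray) -> np.ndarray:
    """Softmax over attention scores, with disallowed positions zeroed out."""
    scores = np.where(mask, scores, -np.inf)
    scores = scores - scores.max(axis=-1, keepdims=True)
    weights = np.exp(scores)
    return weights / weights.sum(axis=-1, keepdims=True)

if __name__ == "__main__":
    # "the cat sat": head indices chosen only for illustration (sat is the root).
    tokens = ["the", "cat", "sat"]
    heads = [1, 2, -1]
    mask = dependency_attention_mask(heads)
    raw_scores = np.random.default_rng(0).standard_normal((len(tokens), len(tokens)))
    print(masked_softmax(raw_scores, mask).round(3))
```

In an actual pretraining setup such a mask would typically be blended with standard full self-attention; the sketch only makes concrete what "building syntactic structure into the training process" can look like.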
Related papers
- Shortcomings of LLMs for Low-Resource Translation: Retrieval and Understanding are Both the Problem [4.830018386227]
This work investigates the in-context learning abilities of pretrained large language models (LLMs) when instructed to translate text from a low-resource language into a high-resource language as part of an automated machine translation pipeline.
We conduct a set of experiments translating Southern Quechua to Spanish and examine the informativity of various types of context retrieved from a constrained database of digitized pedagogical materials and parallel corpora.
arXiv Detail & Related papers (2024-06-21T20:02:22Z)
- Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian as a Case Study [1.6819960041696331]
In this paper, we revisit state-of-the-art Neural Machine Translation techniques to develop automatic translation systems between German and Bavarian.
Our experiment entails applying Back-translation and Transfer Learning to automatically generate more training data and achieve higher translation performance.
Statistical significance testing with Bonferroni correction shows surprisingly strong baseline systems, and that Back-translation leads to significant improvement.
arXiv Detail & Related papers (2024-04-12T06:16:26Z)
- Transferring BERT Capabilities from High-Resource to Low-Resource Languages Using Vocabulary Matching [1.746529892290768]
This work presents a novel approach to transfer BERT capabilities from high-resource to low-resource languages using vocabulary matching.
We conduct experiments on the Silesian and Kashubian languages and demonstrate the effectiveness of our approach to improve the performance of BERT models even when the target language has minimal training data.
arXiv Detail & Related papers (2024-02-22T09:49:26Z)
- MoSECroT: Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer [50.40191599304911]
We introduce MoSECroT (Model Stitching with Static Word Embeddings for Crosslingual Zero-shot Transfer).
In this paper, we present the first framework that leverages relative representations to construct a common space for the embeddings of a source language PLM and the static word embeddings of a target language.
We show that although our proposed framework is competitive with weak baselines when addressing MoSECroT, it fails to achieve competitive results compared with some strong baselines.
arXiv Detail & Related papers (2024-01-09T21:09:07Z)
- Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts [75.33019401706188]
Large language models (LLMs) are known to perform tasks effectively by simply observing a few exemplars.
We propose to assemble synthetic exemplars from a diverse set of high-resource languages to prompt the LLMs to translate from any language into English.
Our unsupervised prompting method performs on par with supervised few-shot learning in LLMs of different sizes for translations between English and 13 Indic and 21 African low-resource languages.
arXiv Detail & Related papers (2023-06-20T08:27:47Z)
- The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z)
- Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation [21.057178077747754]
In this work, we propose OPTICAL: Optimal Transport distillation for low-resource Cross-lingual information retrieval.
By separating the cross-lingual knowledge from knowledge of query-document matching, OPTICAL only needs bitext data for distillation training.
Experimental results show that, with minimal training data, OPTICAL significantly outperforms strong baselines on low-resource languages.
arXiv Detail & Related papers (2023-01-29T22:30:36Z)
- An Exploration of Data Augmentation Techniques for Improving English to Tigrinya Translation [21.636157115922693]
An effective method of generating auxiliary data is back-translation of target language sentences.
We present a case study of Tigrinya where we investigate several back-translation methods to generate synthetic source sentences.
arXiv Detail & Related papers (2021-03-31T03:31:09Z)
- Emergent Communication Pretraining for Few-Shot Machine Translation [66.48990742411033]
We pretrain neural networks via emergent communication from referential games.
Our key assumption is that grounding communication on images, as a crude approximation of real-world environments, inductively biases the model towards learning natural languages.
arXiv Detail & Related papers (2020-11-02T10:57:53Z)
- Comparison of Interactive Knowledge Base Spelling Correction Models for Low-Resource Languages [81.90356787324481]
Spelling normalization for low-resource languages is a challenging task because the patterns are hard to predict.
This work presents a comparison of a neural model and character language models with varying amounts of target language data.
Our usage scenario is interactive correction with nearly zero training examples, improving the models as more data is collected.
arXiv Detail & Related papers (2020-10-20T17:31:07Z)
- Building Low-Resource NER Models Using Non-Speaker Annotation [58.78968578460793]
Cross-lingual methods have had notable success in addressing the scarcity of annotated data for low-resource languages.
We propose a complementary approach to building low-resource Named Entity Recognition (NER) models using "non-speaker" (NS) annotations.
We show that use of NS annotators produces results that are consistently on par or better than cross-lingual methods built on modern contextual representations.
arXiv Detail & Related papers (2020-06-17T03:24:38Z)