Improving Low-Resource Machine Translation via Cross-Linguistic Transfer from Typologically Similar High-Resource Languages
- URL: http://arxiv.org/abs/2501.00045v2
- Date: Fri, 29 Aug 2025 20:42:27 GMT
- Title: Improving Low-Resource Machine Translation via Cross-Linguistic Transfer from Typologically Similar High-Resource Languages
- Authors: Saughmon Boujkian
- Abstract summary: This study examines the cross-linguistic effectiveness of transfer learning for low-resource machine translation. We conduct experiments on five typologically diverse language pairs spanning distinct families. Results show that transfer learning consistently improves translation quality across all pairs.
- Score: 0.0
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: This study examines the cross-linguistic effectiveness of transfer learning for low-resource machine translation by fine-tuning models initially trained on typologically similar high-resource languages, using limited data from the target low-resource language. We hypothesize that linguistic similarity enables efficient adaptation, reducing the need for extensive training data. To test this, we conduct experiments on five typologically diverse language pairs spanning distinct families: Semitic (Modern Standard Arabic to Levantine Arabic), Bantu (Hausa to Zulu), Romance (Spanish to Catalan), Slavic (Slovak to Macedonian), and a language isolate (Eastern Armenian to Western Armenian). Results show that transfer learning consistently improves translation quality across all pairs, confirming its applicability beyond closely related languages. As a secondary analysis, we vary key hyperparameters (learning rate, batch size, number of epochs, and weight decay) to ensure that results do not depend on a single configuration. We find that moderate batch sizes (e.g., 32) are often optimal for similar pairs, smaller sizes benefit less similar pairs, and excessively high learning rates can destabilize training. These findings provide empirical evidence for the generalizability of transfer learning across language families and offer practical guidance for building machine translation systems in low-resource settings with minimal tuning effort.
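The hyperparameter sweep mentioned in the abstract can be illustrated with a minimal sketch. The four hyperparameter names come from the abstract; the value grids, the `SEARCH_SPACE` dictionary, and the `iter_configs` helper are assumptions made for illustration only (the abstract states just one concrete value, a batch size of 32) and are not the author's actual setup.

```python
from itertools import product

# Hypothetical search space: the abstract names these four hyperparameters,
# but every value below except batch size 32 is an assumed placeholder.
SEARCH_SPACE = {
    "learning_rate": [1e-5, 5e-5, 1e-4],
    "batch_size": [8, 16, 32],
    "num_epochs": [3, 5],
    "weight_decay": [0.0, 0.01],
}


def iter_configs(space):
    """Yield one dict per combination of hyperparameter values."""
    names = list(space)
    for values in product(*(space[name] for name in names)):
        yield dict(zip(names, values))


# Enumerate every configuration that a fine-tuning run would be launched with.
configs = list(iter_configs(SEARCH_SPACE))
print(len(configs))  # 3 * 3 * 2 * 2 = 36 combinations
```

Each yielded dictionary would be passed to whatever fine-tuning routine is in use; running the same grid for every language pair is one way to check that a reported gain does not hinge on a single configuration.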
Related papers
- Cross-Linguistic Transfer in Multilingual NLP: The Role of Language Families and Morphology [0.0]
Cross-lingual transfer has become a crucial aspect of multilingual NLP. This paper investigates cross-linguistic transfer through the lens of language families and morphology.
arXiv Detail & Related papers (2025-05-20T04:19:34Z) - A Comparative Study of Translation Bias and Accuracy in Multilingual Large Language Models for Cross-Language Claim Verification [1.566834021297545]
This study systematically evaluates translation bias and the effectiveness of Large Language Models for cross-lingual claim verification.
We investigate two distinct translation methods: pre-translation and self-translation.
Our findings reveal that low-resource languages exhibit significantly lower accuracy in direct inference due to underrepresentation.
arXiv Detail & Related papers (2024-10-14T09:02:42Z) - Targeted Multilingual Adaptation for Low-resource Language Families [17.212424929235624]
We study best practices for adapting a pre-trained model to a language family.
Our adapted models significantly outperform mono- and multilingual baselines.
Low-resource languages can be aggressively up-sampled during training at little detriment to performance in high-resource languages.
arXiv Detail & Related papers (2024-05-20T23:38:06Z) - Investigating Neural Machine Translation for Low-Resource Languages: Using Bavarian as a Case Study [1.6819960041696331]
In this paper, we revisit state-of-the-art Neural Machine Translation techniques to develop automatic translation systems between German and Bavarian.
Our experiment entails applying Back-translation and Transfer Learning to automatically generate more training data and achieve higher translation performance.
Statistical significance results with Bonferroni correction show surprisingly high baseline systems, and that Back-translation leads to significant improvement.
arXiv Detail & Related papers (2024-04-12T06:16:26Z) - On the Calibration of Multilingual Question Answering LLMs [57.296161186129545]
We benchmark the calibration of several multilingual Large Language Models (MLLMs) on a variety of Question Answering tasks.
We study different dimensions of calibration in in-distribution, out-of-distribution, and cross-lingual transfer settings.
For decoder-only LLMs such as Llama 2, we additionally find that in-context learning improves confidence calibration on multilingual data.
arXiv Detail & Related papers (2023-11-15T03:29:02Z) - Optimal Transport Posterior Alignment for Cross-lingual Semantic Parsing [68.47787275021567]
Cross-lingual semantic parsing transfers parsing capability from a high-resource language (e.g., English) to low-resource languages with scarce training data.
We propose a new approach to cross-lingual semantic parsing by explicitly minimizing cross-lingual divergence between latent variables using Optimal Transport.
arXiv Detail & Related papers (2023-07-09T04:52:31Z) - T3L: Translate-and-Test Transfer Learning for Cross-Lingual Text Classification [50.675552118811]
Cross-lingual text classification is typically built on large-scale, multilingual language models (LMs) pretrained on a variety of languages of interest.
We propose revisiting the classic "translate-and-test" pipeline to neatly separate the translation and classification stages.
arXiv Detail & Related papers (2023-06-08T07:33:22Z) - The Best of Both Worlds: Combining Human and Machine Translations for Multilingual Semantic Parsing with Active Learning [50.320178219081484]
We propose an active learning approach that exploits the strengths of both human and machine translations.
An ideal utterance selection can significantly reduce the error and bias in the translated data.
arXiv Detail & Related papers (2023-05-22T05:57:47Z) - Scaling Laws for Multilingual Neural Machine Translation [45.620062316968976]
We study how increases in the model size affect the model performance and investigate the role of the training mixture composition on the scaling behavior.
We find that changing the weightings of the individual language pairs in the training mixture only affects the multiplicative factor of the scaling law.
We leverage our observations to predict the performance of multilingual models trained with any language weighting at any scale.
arXiv Detail & Related papers (2023-02-19T18:43:24Z) - Improving Cross-lingual Information Retrieval on Low-Resource Languages via Optimal Transport Distillation [21.057178077747754]
In this work, we propose OPTICAL: Optimal Transport distillation for low-resource Cross-lingual information retrieval.
By separating the cross-lingual knowledge from knowledge of query document matching, OPTICAL only needs bitext data for distillation training.
Experimental results show that, with minimal training data, OPTICAL significantly outperforms strong baselines on low-resource languages.
arXiv Detail & Related papers (2023-01-29T22:30:36Z) - Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation [129.99918589405675]
Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks.
Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive.
We propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer.
arXiv Detail & Related papers (2022-09-30T05:02:42Z) - High-resource Language-specific Training for Multilingual Neural Machine Translation [109.31892935605192]
We propose the multilingual translation model with the high-resource language-specific training (HLT-MT) to alleviate the negative interference.
Specifically, we first train the multilingual model only with the high-resource pairs and select the language-specific modules at the top of the decoder.
HLT-MT is further trained on all available corpora to transfer knowledge from high-resource languages to low-resource languages.
arXiv Detail & Related papers (2022-07-11T14:33:13Z) - Analyzing the Mono- and Cross-Lingual Pretraining Dynamics of Multilingual Language Models [73.11488464916668]
This study investigates the dynamics of the multilingual pretraining process.
We probe checkpoints taken from throughout XLM-R pretraining, using a suite of linguistic tasks.
Our analysis shows that the model achieves high in-language performance early on, with lower-level linguistic skills acquired before more complex ones.
arXiv Detail & Related papers (2022-05-24T03:35:00Z) - Phylogeny-Inspired Adaptation of Multilingual Models to New Languages [43.62238334380897]
We show how we can use language phylogenetic information to improve cross-lingual transfer leveraging closely related languages.
We perform adapter-based training on languages from diverse language families (Germanic, Uralic, Tupian, Uto-Aztecan) and evaluate on both syntactic and semantic tasks.
arXiv Detail & Related papers (2022-05-19T15:49:19Z) - Geographical Distance Is The New Hyperparameter: A Case Study Of Finding The Optimal Pre-trained Language For English-isiZulu Machine Translation [0.0]
This study explores the potential benefits of transfer learning in an English-isiZulu translation framework.
We gathered results from 8 different language corpora, including one multi-lingual corpus, and saw that isiXhosa-isiZulu outperformed all languages.
We also derived a new coefficient, Nasir's Geographical Distance Coefficient (NGDC) which provides an easy selection of languages for the pre-trained models.
arXiv Detail & Related papers (2022-05-17T20:41:25Z) - Discovering Representation Sprachbund For Multilingual Pre-Training [139.05668687865688]
We generate language representation from multilingual pre-trained models and conduct linguistic analysis.
We cluster all the target languages into multiple groups and name each group as a representation sprachbund.
Experiments are conducted on cross-lingual benchmarks and significant improvements are achieved compared to strong baselines.
arXiv Detail & Related papers (2021-09-01T09:32:06Z) - Adaptive Sparse Transformer for Multilingual Translation [18.017674093519332]
A known challenge of multilingual models is the negative language interference.
We propose an adaptive and sparse architecture for multilingual modeling.
Our model outperforms strong baselines in terms of translation quality without increasing the inference cost.
arXiv Detail & Related papers (2021-04-15T10:31:07Z) - Are Multilingual Models Effective in Code-Switching? [57.78477547424949]
We study the effectiveness of multilingual language models to understand their capability and adaptability to the mixed-language setting.
Our findings suggest that pre-trained multilingual models do not necessarily guarantee high-quality representations on code-switching.
arXiv Detail & Related papers (2021-03-24T16:20:02Z) - Pre-training Multilingual Neural Machine Translation by Leveraging Alignment Information [72.2412707779571]
mRASP is an approach to pre-train a universal multilingual neural machine translation model.
We carry out experiments on 42 translation directions across diverse settings, including low-, medium-, and rich-resource pairs, as well as transfer to exotic language pairs.
arXiv Detail & Related papers (2020-10-07T03:57:54Z) - Translation Artifacts in Cross-lingual Transfer Learning [51.66536640084888]
We show that machine translation can introduce subtle artifacts that have a notable impact in existing cross-lingual models.
In natural language inference, translating the premise and the hypothesis independently can reduce the lexical overlap between them.
We also improve the state-of-the-art in XNLI for the translate-test and zero-shot approaches by 4.3 and 2.8 points, respectively.
arXiv Detail & Related papers (2020-04-09T17:54:30Z) - Transfer learning and subword sampling for asymmetric-resource one-to-many neural translation [14.116412358534442]
Methods for improving neural machine translation for low-resource languages are reviewed.
Tests are carried out on three artificially restricted translation tasks and one real-world task.
Experiments show positive effects especially for scheduled multi-task learning, denoising autoencoder, and subword sampling.
arXiv Detail & Related papers (2020-04-08T14:19:05Z)
This list is automatically generated from the titles and abstracts of the papers in this site.