Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo
- URL: http://arxiv.org/abs/2504.17252v1
- Date: Thu, 24 Apr 2025 05:02:26 GMT
- Title: Low-Resource Neural Machine Translation Using Recurrent Neural Networks and Transfer Learning: A Case Study on English-to-Igbo
- Authors: Ocheme Anthony Ekle, Biswarup Das
- Abstract summary: We develop Neural Machine Translation (NMT) and Transformer-based transfer learning models for English-to-Igbo translation. Our models are trained on a curated and benchmarked dataset compiled from Bible corpora, local news, Wikipedia articles, and Common Crawl. With transfer learning, we observe a performance gain of +4.83 BLEU points, reaching an estimated translation accuracy of 70%.
- License: http://arxiv.org/licenses/nonexclusive-distrib/1.0/
- Abstract: In this study, we develop Neural Machine Translation (NMT) and Transformer-based transfer learning models for English-to-Igbo translation - a low-resource African language spoken by over 40 million people across Nigeria and West Africa. Our models are trained on a curated and benchmarked dataset compiled from Bible corpora, local news, Wikipedia articles, and Common Crawl, all verified by native language experts. We leverage Recurrent Neural Network (RNN) architectures, including Long Short-Term Memory (LSTM) and Gated Recurrent Units (GRU), enhanced with attention mechanisms to improve translation accuracy. To further enhance performance, we apply transfer learning using MarianNMT pre-trained models within the SimpleTransformers framework. Our RNN-based system achieves competitive results, closely matching existing English-Igbo benchmarks. With transfer learning, we observe a performance gain of +4.83 BLEU points, reaching an estimated translation accuracy of 70%. These findings highlight the effectiveness of combining RNNs with transfer learning to address the performance gap in low-resource language translation tasks.
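To make the transfer-learning recipe concrete, here is a minimal sketch of fine-tuning a pre-trained MarianNMT model through the SimpleTransformers framework, as the abstract describes. The checkpoint name, toy sentence pairs, and hyperparameters are illustrative assumptions, not details taken from the paper:

```python
# Hedged sketch: MarianNMT transfer learning via SimpleTransformers.
# Checkpoint name, toy data, and hyperparameters are assumptions.
import pandas as pd
from simpletransformers.seq2seq import Seq2SeqModel, Seq2SeqArgs

# Toy parallel data; the paper's corpus combines Bible text, local news,
# Wikipedia articles, and Common Crawl, verified by native speakers.
train_df = pd.DataFrame(
    [["Good morning", "Ụtụtụ ọma"], ["Thank you", "Daalụ"]],
    columns=["input_text", "target_text"],
)

args = Seq2SeqArgs()
args.num_train_epochs = 3
args.train_batch_size = 8
args.overwrite_output_dir = True

model = Seq2SeqModel(
    encoder_decoder_type="marian",
    # Assumed OPUS-MT English->Igbo checkpoint; swap in whichever
    # MarianNMT model fits the language pair.
    encoder_decoder_name="Helsinki-NLP/opus-mt-en-ig",
    args=args,
    use_cuda=False,  # set True when a GPU is available
)

model.train_model(train_df)             # fine-tune on the parallel data
print(model.predict(["How are you?"]))  # translate unseen English input
```

Fine-tuning a pre-trained checkpoint this way is what the abstract credits for the +4.83 BLEU gain over the RNN baselines.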
Related papers
- Understanding and Analyzing Model Robustness and Knowledge-Transfer in Multilingual Neural Machine Translation using TX-Ray [3.2771631221674333]
This research investigates how knowledge transfer across languages can enhance Multilingual Neural Machine Translation (MNMT) in extremely low-resource settings.
We perform English-German, English-French, and English-Spanish translations, leveraging minimal parallel data to establish cross-lingual mappings.
We find that sequential transfer learning outperforms baselines on a 40k parallel sentence corpus, showcasing its efficacy.
arXiv Detail & Related papers (2024-12-18T14:21:58Z) - Beyond MLE: Investigating SEARNN for Low-Resourced Neural Machine Translation [0.09459165957946088]
This project explored the potential of SEARNN to improve machine translation for low-resourced African languages.
Experiments were conducted on the English-to-Igbo, French-to-Ewe, and French-to-Ghomala translation directions.
We find that SEARNN is indeed a viable algorithm for effectively training RNNs on machine translation for low-resourced languages.
arXiv Detail & Related papers (2024-05-20T06:28:43Z) - Boosting Unsupervised Machine Translation with Pseudo-Parallel Data [2.900810893770134]
We propose a training strategy that relies on pseudo-parallel sentence pairs mined from monolingual corpora and synthetic sentence pairs back-translated from monolingual corpora.
We reach an improvement of up to 14.5 BLEU points (English to Ukrainian) over a baseline trained on back-translated data only.
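The back-translation half of that strategy is easy to sketch: a reverse-direction model turns monolingual target-side text into synthetic source sentences, and the resulting pairs augment training. A minimal illustration with Hugging Face MarianMT, using an assumed Ukrainian-to-English checkpoint to match the language pair quoted above:

```python
# Hedged sketch of back-translation: build synthetic parallel pairs by
# translating monolingual target-language text back into the source language.
# The checkpoint name is an assumption; any reverse-direction model would do.
from transformers import MarianMTModel, MarianTokenizer

name = "Helsinki-NLP/opus-mt-uk-en"  # assumed Ukrainian -> English model
tokenizer = MarianTokenizer.from_pretrained(name)
model = MarianMTModel.from_pretrained(name)

monolingual_uk = ["Доброго ранку!", "Дякую за допомогу."]

batch = tokenizer(monolingual_uk, return_tensors="pt", padding=True)
synthetic_en = tokenizer.batch_decode(
    model.generate(**batch), skip_special_tokens=True
)

# Each (synthetic English, authentic Ukrainian) pair becomes extra training
# data for the forward English -> Ukrainian model.
pseudo_parallel = list(zip(synthetic_en, monolingual_uk))
print(pseudo_parallel)
```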
arXiv Detail & Related papers (2023-10-22T10:57:12Z) - Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models [48.44820587495038]
Self-supervised representation learning (SSRL) has demonstrated superior performance to supervised models for tasks including phoneme recognition.
Training SSRL models poses a challenge for low-resource languages where sufficient pre-training data may not be available.
We propose to use audio augmentation techniques, namely pitch variation, noise addition, accented target-language speech, and other-language speech, to pre-train SSRL models in a low-resource condition and evaluate phoneme recognition.
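Two of the named augmentations, pitch variation and noise addition, are straightforward to reproduce; a rough sketch with librosa and NumPy, where the shift range, SNR, and file path are illustrative:

```python
# Hedged sketch of two augmentations named above: pitch variation and noise
# addition. Shift range, SNR, and the file path are illustrative choices.
import numpy as np
import librosa

waveform, sr = librosa.load("utterance.wav", sr=16000)  # placeholder path

# Pitch variation: shift by a random number of semitones.
n_steps = float(np.random.uniform(-2.0, 2.0))
pitched = librosa.effects.pitch_shift(waveform, sr=sr, n_steps=n_steps)

# Noise addition: mix in Gaussian noise at a fixed signal-to-noise ratio.
snr_db = 15.0
signal_power = np.mean(waveform ** 2)
noise_power = signal_power / (10 ** (snr_db / 10))
noisy = waveform + np.random.normal(0.0, np.sqrt(noise_power), waveform.shape)

# Both variants join the pre-training pool alongside the original audio.
```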
arXiv Detail & Related papers (2023-09-22T10:09:09Z) - Learning to Generalize to More: Continuous Semantic Augmentation for Neural Machine Translation [50.54059385277964]
We present a novel data augmentation paradigm termed Continuous Semantic Augmentation (CsaNMT).
CsaNMT augments each training instance with an adjacency region that covers adequate variants of literal expression under the same meaning.
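Very loosely, the adjacency region can be pictured as a small ball of perturbed semantic vectors around a sentence representation; the toy sketch below illustrates that intuition only and is not CsaNMT's actual sampling procedure:

```python
# Toy illustration of the adjacency-region intuition behind CsaNMT: sample
# augmented vectors in a small neighborhood of a sentence representation.
# This is a simplification, not the paper's actual algorithm.
import torch

def sample_adjacent(h: torch.Tensor, radius: float = 0.1, k: int = 4) -> torch.Tensor:
    """Return k perturbed copies of sentence representation h within `radius`."""
    noise = torch.randn(k, h.shape[-1])
    direction = noise / noise.norm(dim=-1, keepdim=True)
    scale = torch.rand(k, 1) * radius  # random distance inside the ball
    return h.unsqueeze(0) + scale * direction

h = torch.randn(512)           # sentence representation from an encoder
variants = sample_adjacent(h)  # semantic neighbors used as augmentations
print(variants.shape)          # torch.Size([4, 512])
```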
arXiv Detail & Related papers (2022-04-14T08:16:28Z) - Towards Better Chinese-centric Neural Machine Translation for Low-resource Languages [12.374365655284342]
Building a neural machine translation (NMT) system has become an urgent task, especially in the low-resource setting.
Recent work tends to study NMT systems for low-resource languages centered on English, while few works focus on low-resource NMT systems centered on other languages such as Chinese.
We present the winning competition system, which leverages data enhancement with monolingual word embeddings, bilingual curriculum learning, and contrastive re-ranking.
arXiv Detail & Related papers (2022-04-09T01:05:37Z) - Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval [66.69799641522133]
State-of-the-art neural (re)rankers are notoriously data-hungry.
Current approaches typically transfer rankers trained on English data to other languages and cross-lingual setups by means of multilingual encoders.
We show that two parameter-efficient approaches to cross-lingual transfer, namely Sparse Fine-Tuning Masks (SFTMs) and Adapters, allow for a more lightweight and more effective zero-shot transfer.
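Adapters, one of the two approaches named, are small bottleneck layers with a residual connection inserted into a frozen pre-trained encoder; a minimal PyTorch sketch of the standard block, with illustrative dimensions (SFTMs are not shown):

```python
# Minimal sketch of a standard adapter block: a bottleneck with a residual
# connection, inserted into an otherwise frozen transformer layer.
# Dimensions are illustrative; only the adapter's parameters get trained.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int = 768, bottleneck: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_size, bottleneck)  # down-projection
        self.up = nn.Linear(bottleneck, hidden_size)    # up-projection
        self.act = nn.ReLU()

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # The residual keeps the frozen model's representation intact.
        return x + self.up(self.act(self.down(x)))

adapter = Adapter()
hidden_states = torch.randn(2, 10, 768)  # (batch, seq_len, hidden)
print(adapter(hidden_states).shape)      # torch.Size([2, 10, 768])
```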
arXiv Detail & Related papers (2022-04-05T15:44:27Z) - Cross-lingual Transferring of Pre-trained Contextualized Language Models [73.97131976850424]
We propose a novel cross-lingual model transferring framework for PrLMs: TreLM.
To handle the symbol order and sequence length differences between languages, we propose an intermediate "TRILayer" structure.
We show the proposed framework significantly outperforms language models trained from scratch with limited data in both performance and efficiency.
arXiv Detail & Related papers (2021-07-27T06:51:13Z) - Neural Machine Translation model for University Email Application [1.4731169524644787]
A state-of-the-art Sequence-to-Sequence Neural Network for ML -> EN and EN -> ML translations is compared with Google Translate.
The low BLEU score of Google Translate indicates that the application-based regional models are better.
arXiv Detail & Related papers (2020-07-20T15:05:16Z) - Learning Source Phrase Representations for Neural Machine Translation [65.94387047871648]
We propose an attentive phrase representation generation mechanism which is able to generate phrase representations from corresponding token representations.
In our experiments, we obtain significant improvements on the WMT 14 English-German and English-French tasks on top of the strong Transformer baseline.
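The phrase-representation mechanism can be approximated as attention pooling over the token vectors inside a phrase; the sketch below captures that idea with an assumed parameterization, not the paper's exact one:

```python
# Rough sketch: generate a phrase representation from its token
# representations via attention pooling. The parameterization is assumed.
import torch
import torch.nn as nn

class AttentivePhrasePooling(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.score = nn.Linear(dim, 1)  # scores each token in the phrase

    def forward(self, token_reps: torch.Tensor) -> torch.Tensor:
        # token_reps: (phrase_len, dim) -> attention-weighted phrase vector.
        weights = torch.softmax(self.score(token_reps), dim=0)
        return (weights * token_reps).sum(dim=0)

pool = AttentivePhrasePooling()
phrase_tokens = torch.randn(3, 512)  # token vectors for a 3-token phrase
print(pool(phrase_tokens).shape)     # torch.Size([512])
```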
arXiv Detail & Related papers (2020-06-25T13:43:11Z) - It's Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information [90.35685796083563]
Cross-mutual information (XMI) is an asymmetric information-theoretic metric of machine translation difficulty.
XMI exploits the probabilistic nature of most neural machine translation models.
We present the first systematic and controlled study of cross-lingual translation difficulties using modern neural translation systems.
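In that paper's formulation, XMI measures how much easier the target T becomes to model once the source S is given, using learned models in place of the true distributions (notation lightly adapted):

```latex
% Cross-entropy of a target-side language model minus cross-entropy of the
% translation model given the source; larger XMI means S helps more.
\mathrm{XMI}(S \rightarrow T) = H_{q_{\mathrm{LM}}}(T) - H_{q_{\mathrm{MT}}}(T \mid S)
```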
arXiv Detail & Related papers (2020-05-05T17:38:48Z) - Improving Massively Multilingual Neural Machine Translation and Zero-Shot Translation [81.7786241489002]
Massively multilingual models for neural machine translation (NMT) are theoretically attractive, but often underperform bilingual models and deliver poor zero-shot translations.
We argue that multilingual NMT requires stronger modeling capacity to support language pairs with varying typological characteristics.
We propose random online backtranslation to enforce the translation of unseen training language pairs.
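Random online backtranslation can be sketched as: during training, pick a random intermediate language and back-translate the target sentence into it with the current model, yielding a synthetic pair for an otherwise unseen direction. The schematic below uses a stand-in `translate` method, since the real system's decoding API is not specified here:

```python
# Schematic sketch of random online backtranslation (ROBT). The model's
# `translate` method is a stand-in, not a real library API.
import random

LANGS = ["de", "fr", "es", "zh"]

def robt_step(model, batch):
    """batch: list of (src_lang, src_text, tgt_lang, tgt_text) tuples."""
    synthetic = []
    for src_lang, src, tgt_lang, tgt in batch:
        rand_lang = random.choice([l for l in LANGS if l != tgt_lang])
        # Back-translate the target into a random language with the current
        # model, creating data for the unseen (rand_lang -> tgt_lang) pair.
        pivot = model.translate(tgt, src_lang=tgt_lang, tgt_lang=rand_lang)
        synthetic.append((rand_lang, pivot, tgt_lang, tgt))
    return batch + synthetic  # train on authentic plus synthetic pairs

class DummyModel:  # placeholder so the sketch runs end to end
    def translate(self, text, src_lang, tgt_lang):
        return f"<{tgt_lang}> {text}"

print(robt_step(DummyModel(), [("en", "Hello", "de", "Hallo")]))
```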
arXiv Detail & Related papers (2020-04-24T17:21:32Z)
This list is automatically generated from the titles and abstracts of the papers on this site.